Overview

Dataset statistics

Number of variables: 112
Number of observations: 1926061
Missing cells: 149451500
Missing cells (%): 69.3%
Total size in memory: 1.6 GiB
Average record size in memory: 896.0 B
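
The percentages and sizes above follow from the raw counts. The sketch below reproduces that arithmetic. It assumes the missing-cell percentage is missing cells divided by rows times columns, and that each cell costs 8 bytes (one pointer per value). The 8-byte figure is an assumption chosen because it agrees with the reported 896.0 B per record; the report itself does not state it.

```python
# Figures copied from the dataset statistics.
n_variables = 112
n_observations = 1_926_061
missing_cells = 149_451_500

total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: 8 bytes per cell (one pointer per value in a text column).
bytes_per_record = 8 * n_variables
total_gib = bytes_per_record * n_observations / 2**30

print(f"Missing cells (%): {missing_pct:.1f}%")
print(f"Average record size: {bytes_per_record} B")
print(f"Total size in memory: {total_gib:.1f} GiB")
```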

Variable types

Text: 112

Dataset

Description: Invertebrate Zoology NMNH Extant Specimen Records 0052489-241126133413365
URL: https://doi.org/10.15468/dl.fya67r

Alerts

institutionID has constant value "urn:lsid:biocol.org:col:34871" Constant
collectionID has constant value "urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6" Constant
institutionCode has constant value "USNM" Constant
collectionCode has constant value "IZ" Constant
datasetName has constant value "NMNH Extant Biology" Constant
associatedReferences has constant value "9" Constant
materialSampleID has constant value "North Pacific Ocean, Gulf Of California" Constant
higherGeographyID has constant value "24.1667" Constant
countryCode has constant value "24 10 00 N" Constant
verticalDatum has constant value "152" Constant
georeferencedBy has constant value "Idaho" Constant
latestAgeOrHighestStage has constant value "Moultrie" Constant
dateIdentified has constant value "-83.7685" Constant
originalNameUsage has constant value "GEOLocate" Constant
subfamily has constant value "47 57 0 N" Constant
tribe has constant value "129 4 0 W" Constant
subtribe has constant value "Seurat, L. G." Constant
cultivarEpithet has constant value "GEOLocate" Constant
nomenclaturalStatus has constant value "Camallanus seurati" Constant
recordNumber has 1804320 (93.7%) missing values Missing
recordedBy has 763966 (39.7%) missing values Missing
sex has 1744021 (90.5%) missing values Missing
lifeStage has 1837065 (95.4%) missing values Missing
occurrenceStatus has 1926059 (> 99.9%) missing values Missing
disposition has 1926059 (> 99.9%) missing values Missing
associatedMedia has 1672204 (86.8%) missing values Missing
associatedOccurrences has 1926059 (> 99.9%) missing values Missing
associatedReferences has 1926059 (> 99.9%) missing values Missing
associatedSequences has 1920937 (99.7%) missing values Missing
occurrenceRemarks has 1144278 (59.4%) missing values Missing
materialEntityRemarks has 1926059 (> 99.9%) missing values Missing
verbatimLabel has 1926059 (> 99.9%) missing values Missing
materialSampleID has 1926060 (> 99.9%) missing values Missing
eventType has 1926059 (> 99.9%) missing values Missing
fieldNumber has 1339537 (69.5%) missing values Missing
eventDate has 684431 (35.5%) missing values Missing
startDayOfYear has 772926 (40.1%) missing values Missing
endDayOfYear has 773095 (40.1%) missing values Missing
year has 684432 (35.5%) missing values Missing
month has 768070 (39.9%) missing values Missing
day has 841840 (43.7%) missing values Missing
verbatimEventDate has 1172997 (60.9%) missing values Missing
habitat has 1856817 (96.4%) missing values Missing
locationID has 983901 (51.1%) missing values Missing
higherGeographyID has 1926060 (> 99.9%) missing values Missing
higherGeography has 67820 (3.5%) missing values Missing
continent has 585602 (30.4%) missing values Missing
waterBody has 666547 (34.6%) missing values Missing
islandGroup has 1925291 (> 99.9%) missing values Missing
island has 1925083 (99.9%) missing values Missing
country has 141874 (7.4%) missing values Missing
countryCode has 1926060 (> 99.9%) missing values Missing
stateProvince has 943504 (49.0%) missing values Missing
county has 1786110 (92.7%) missing values Missing
locality has 642266 (33.3%) missing values Missing
minimumElevationInMeters has 1919257 (99.6%) missing values Missing
maximumElevationInMeters has 1922544 (99.8%) missing values Missing
verbatimElevation has 1925599 (> 99.9%) missing values Missing
verticalDatum has 1926060 (> 99.9%) missing values Missing
minimumDepthInMeters has 1143588 (59.4%) missing values Missing
maximumDepthInMeters has 1205034 (62.6%) missing values Missing
verbatimDepth has 1899821 (98.6%) missing values Missing
decimalLatitude has 927243 (48.1%) missing values Missing
decimalLongitude has 927246 (48.1%) missing values Missing
geodeticDatum has 1858158 (96.5%) missing values Missing
coordinatePrecision has 1926059 (> 99.9%) missing values Missing
pointRadiusSpatialFit has 1926059 (> 99.9%) missing values Missing
verbatimCoordinates has 1926059 (> 99.9%) missing values Missing
verbatimLatitude has 1854408 (96.3%) missing values Missing
verbatimLongitude has 1854475 (96.3%) missing values Missing
verbatimCoordinateSystem has 1246668 (64.7%) missing values Missing
footprintSRS has 1926059 (> 99.9%) missing values Missing
georeferencedBy has 1926060 (> 99.9%) missing values Missing
georeferenceProtocol has 1265567 (65.7%) missing values Missing
georeferenceSources has 1926058 (> 99.9%) missing values Missing
georeferenceRemarks has 1895791 (98.4%) missing values Missing
geologicalContextID has 1926058 (> 99.9%) missing values Missing
earliestEonOrLowestEonothem has 1926058 (> 99.9%) missing values Missing
latestEonOrHighestEonothem has 1926059 (> 99.9%) missing values Missing
earliestEraOrLowestErathem has 1926052 (> 99.9%) missing values Missing
earliestPeriodOrLowestSystem has 1926051 (> 99.9%) missing values Missing
latestPeriodOrHighestSystem has 1926054 (> 99.9%) missing values Missing
earliestEpochOrLowestSeries has 1926050 (> 99.9%) missing values Missing
latestEpochOrHighestSeries has 1926054 (> 99.9%) missing values Missing
earliestAgeOrLowestStage has 1926054 (> 99.9%) missing values Missing
latestAgeOrHighestStage has 1926060 (> 99.9%) missing values Missing
highestBiostratigraphicZone has 1926059 (> 99.9%) missing values Missing
verbatimIdentification has 1926059 (> 99.9%) missing values Missing
identificationQualifier has 1907923 (99.1%) missing values Missing
typeStatus has 1838230 (95.4%) missing values Missing
identifiedBy has 1085026 (56.3%) missing values Missing
identifiedByID has 1926059 (> 99.9%) missing values Missing
dateIdentified has 1926060 (> 99.9%) missing values Missing
identificationReferences has 1926052 (> 99.9%) missing values Missing
identificationRemarks has 1926056 (> 99.9%) missing values Missing
scientificNameID has 1926059 (> 99.9%) missing values Missing
acceptedNameUsageID has 1926053 (> 99.9%) missing values Missing
nameAccordingToID has 1926059 (> 99.9%) missing values Missing
scientificName has 353701 (18.4%) missing values Missing
parentNameUsage has 1926059 (> 99.9%) missing values Missing
originalNameUsage has 1926060 (> 99.9%) missing values Missing
class has 76135 (4.0%) missing values Missing
order has 940799 (48.8%) missing values Missing
family has 191835 (10.0%) missing values Missing
subfamily has 1926060 (> 99.9%) missing values Missing
tribe has 1926060 (> 99.9%) missing values Missing
subtribe has 1926060 (> 99.9%) missing values Missing
genus has 353878 (18.4%) missing values Missing
subgenus has 1813329 (94.1%) missing values Missing
specificEpithet has 353916 (18.4%) missing values Missing
infraspecificEpithet has 1866911 (96.9%) missing values Missing
cultivarEpithet has 1926058 (> 99.9%) missing values Missing
taxonRank has 1866911 (96.9%) missing values Missing
scientificNameAuthorship has 756930 (39.3%) missing values Missing
nomenclaturalCode has 1926059 (> 99.9%) missing values Missing
nomenclaturalStatus has 1926060 (> 99.9%) missing values Missing
gbifID has unique values Unique
occurrenceID has unique values Unique
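
The three alert kinds in this list (Constant, Missing, Unique) can be derived from per-column counts. The sketch below is one way to do it. The helper `classify` and its `missing_threshold` cutoff are hypothetical illustrations and not the profiling library's API; the report does not state the cutoff it uses for the Missing alert.

```python
def classify(n_rows, n_missing, n_distinct, missing_threshold=0.01):
    """Return alert labels for one column.

    Hypothetical helper; missing_threshold is an assumed cutoff.
    """
    alerts = []
    n_present = n_rows - n_missing
    # Constant: every non-missing cell holds the same value.
    if n_present > 0 and n_distinct == 1:
        alerts.append("Constant")
    # Missing: the share of empty cells exceeds the assumed cutoff.
    if n_missing / n_rows > missing_threshold:
        alerts.append("Missing")
    # Unique: no gaps and no repeated value.
    if n_missing == 0 and n_distinct == n_rows:
        alerts.append("Unique")
    return alerts


# materialSampleID: one value present, everything else missing.
print(classify(1926061, 1926060, 1))
# gbifID: every row has its own value.
print(classify(1926061, 0, 1926061))
```

A column such as materialSampleID therefore appears twice in the list, once as Constant and once as Missing, because a single surviving value satisfies both conditions.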

Reproduction

Analysis started: 2025-01-14 16:46:22.038176
Analysis finished: 2025-01-14 16:47:42.199936
Duration: 1 minute and 20.16 seconds
Software version: ydata-profiling v4.12.1
Download configuration: config.json
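
The reported duration can be checked against the two timestamps. A minimal sketch using only the standard library:

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2025-01-14 16:46:22.038176")
finished = datetime.fromisoformat("2025-01-14 16:47:42.199936")

elapsed = (finished - started).total_seconds()
minutes, seconds = divmod(elapsed, 60)

print(f"{int(minutes)} minute and {seconds:.2f} seconds")
```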

Variables

gbifID
Text

Unique 

Distinct: 1926061
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 19260610
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
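
As an illustration of those character properties, the sketch below counts characters by Unicode general category with Python's standard `unicodedata` module. The helper `category_counts` and the label table are illustrative names, not part of the report, and the sketch covers categories only; scripts and blocks are not exposed by the standard library.

```python
import unicodedata
from collections import Counter

# Readable labels for the general-category codes that occur in this report.
CATEGORY_NAMES = {
    "Nd": "Decimal Number",
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Zs": "Space Separator",
}


def category_counts(values):
    """Count characters per Unicode general category over all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


print(category_counts(["urn:lsid:biocol.org:col:34871"]))
```

For the institutionID value used here, the per-string counts multiplied by the 1926061 rows give the category totals the report lists for that variable, for example 19 lowercase letters per row and 36595159 overall.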

Unique

Unique: 1926061
Unique (%): 100.0%

Sample

1st row: 1321728981
2nd row: 1320179422
3rd row: 1320179575
4th row: 1321729723
5th row: 1320179846

Value                    Count     Frequency (%)
1321728981               1         < 0.1%
1320183643               1         < 0.1%
1321730497               1         < 0.1%
1320180949               1         < 0.1%
1320181165               1         < 0.1%
1456364805               1         < 0.1%
1320182209               1         < 0.1%
1321732097               1         < 0.1%
2571470239               1         < 0.1%
1320182449               1         < 0.1%
Other values (1926051)   1926051   > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 3940992
20.5%
3 2929721
15.2%
2 2443540
12.7%
7 1519647
 
7.9%
8 1483597
 
7.7%
0 1475792
 
7.7%
9 1468721
 
7.6%
5 1371139
 
7.1%
6 1316858
 
6.8%
4 1310603
 
6.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 19260610
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 3940992
20.5%
3 2929721
15.2%
2 2443540
12.7%
7 1519647
 
7.9%
8 1483597
 
7.7%
0 1475792
 
7.7%
9 1468721
 
7.6%
5 1371139
 
7.1%
6 1316858
 
6.8%
4 1310603
 
6.8%

Most occurring scripts

ValueCountFrequency (%)
Common 19260610
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 3940992
20.5%
3 2929721
15.2%
2 2443540
12.7%
7 1519647
 
7.9%
8 1483597
 
7.7%
0 1475792
 
7.7%
9 1468721
 
7.6%
5 1371139
 
7.1%
6 1316858
 
6.8%
4 1310603
 
6.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19260610
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 3940992
20.5%
3 2929721
15.2%
2 2443540
12.7%
7 1519647
 
7.9%
8 1483597
 
7.7%
0 1475792
 
7.7%
9 1468721
 
7.6%
5 1371139
 
7.1%
6 1316858
 
6.8%
4 1310603
 
6.8%

modified
Text

Distinct: 113479
Distinct (%): 5.9%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 36595159
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 62369
Unique (%): 3.2%
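
The report gives two different figures for this variable: 113479 distinct values and 62369 unique values. The sketch below shows one reading of the difference, assuming Distinct counts different values and Unique counts values that occur exactly once. That reading is consistent with the figures but is not a definition the report states, and `distinct_and_unique` is an illustrative helper name.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for n in counts.values() if n == 1)
    return distinct, unique


# Two rows share a timestamp, so it is distinct but not unique.
rows = [
    "2024-09-25 16:08:00",
    "2024-09-25 16:08:00",
    "2021-10-06 15:29:00",
    "2020-01-06 17:42:00",
]
print(distinct_and_unique(rows))
```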

Sample

1st row: 2021-10-06 15:29:00
2nd row: 2024-09-25 16:08:00
3rd row: 2020-01-06 17:42:00
4th row: 2018-09-17 12:46:00
5th row: 2024-09-25 15:32:00

Value                 Count     Frequency (%)
2024-09-25            692724    18.0%
2018-09-17            227538    5.9%
2019-11-01            80341     2.1%
2021-10-06            56982     1.5%
2014-10-08            33474     0.9%
2014-10-09            25882     0.7%
2017-03-29            25186     0.7%
2013-01-10            21865     0.6%
2024-08-19            19853     0.5%
2014-10-20            17831     0.5%
Other values (3940)   2650446   68.8%

Most occurring characters

ValueCountFrequency (%)
0 8969856
24.5%
2 4987650
13.6%
1 4687977
12.8%
- 3852122
10.5%
: 3852122
10.5%
(space) 1926061
 
5.3%
4 1757416
 
4.8%
5 1701788
 
4.7%
9 1536715
 
4.2%
3 1149662
 
3.1%
Other values (3) 2173790
 
5.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 26964854
73.7%
Dash Punctuation 3852122
 
10.5%
Other Punctuation 3852122
 
10.5%
Space Separator 1926061
 
5.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 8969856
33.3%
2 4987650
18.5%
1 4687977
17.4%
4 1757416
 
6.5%
5 1701788
 
6.3%
9 1536715
 
5.7%
3 1149662
 
4.3%
7 807635
 
3.0%
6 700968
 
2.6%
8 665187
 
2.5%
Dash Punctuation
ValueCountFrequency (%)
- 3852122
100.0%
Other Punctuation
ValueCountFrequency (%)
: 3852122
100.0%
Space Separator
ValueCountFrequency (%)
(space) 1926061
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36595159
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 8969856
24.5%
2 4987650
13.6%
1 4687977
12.8%
- 3852122
10.5%
: 3852122
10.5%
(space) 1926061
 
5.3%
4 1757416
 
4.8%
5 1701788
 
4.7%
9 1536715
 
4.2%
3 1149662
 
3.1%
Other values (3) 2173790
 
5.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36595159
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 8969856
24.5%
2 4987650
13.6%
1 4687977
12.8%
- 3852122
10.5%
: 3852122
10.5%
(space) 1926061
 
5.3%
4 1757416
 
4.8%
5 1701788
 
4.7%
9 1536715
 
4.2%
3 1149662
 
3.1%
Other values (3) 2173790
 
5.9%

institutionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 29
Median length: 29
Mean length: 29
Min length: 29

Characters and Unicode

Total characters: 55855769
Distinct characters: 18
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:lsid:biocol.org:col:34871
2nd row: urn:lsid:biocol.org:col:34871
3rd row: urn:lsid:biocol.org:col:34871
4th row: urn:lsid:biocol.org:col:34871
5th row: urn:lsid:biocol.org:col:34871

Value                           Count     Frequency (%)
urn:lsid:biocol.org:col:34871   1926061   100.0%

Most occurring characters

ValueCountFrequency (%)
o 7704244
13.8%
: 7704244
13.8%
l 5778183
 
10.3%
i 3852122
 
6.9%
r 3852122
 
6.9%
c 3852122
 
6.9%
g 1926061
 
3.4%
7 1926061
 
3.4%
8 1926061
 
3.4%
4 1926061
 
3.4%
Other values (8) 15408488
27.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 36595159
65.5%
Other Punctuation 9630305
 
17.2%
Decimal Number 9630305
 
17.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 7704244
21.1%
l 5778183
15.8%
i 3852122
10.5%
r 3852122
10.5%
c 3852122
10.5%
g 1926061
 
5.3%
u 1926061
 
5.3%
b 1926061
 
5.3%
d 1926061
 
5.3%
s 1926061
 
5.3%
Decimal Number
ValueCountFrequency (%)
7 1926061
20.0%
8 1926061
20.0%
4 1926061
20.0%
3 1926061
20.0%
1 1926061
20.0%
Other Punctuation
ValueCountFrequency (%)
: 7704244
80.0%
. 1926061
 
20.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 36595159
65.5%
Common 19260610
34.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 7704244
21.1%
l 5778183
15.8%
i 3852122
10.5%
r 3852122
10.5%
c 3852122
10.5%
g 1926061
 
5.3%
u 1926061
 
5.3%
b 1926061
 
5.3%
d 1926061
 
5.3%
s 1926061
 
5.3%
Common
ValueCountFrequency (%)
: 7704244
40.0%
7 1926061
 
10.0%
8 1926061
 
10.0%
4 1926061
 
10.0%
3 1926061
 
10.0%
. 1926061
 
10.0%
1 1926061
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 55855769
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 7704244
13.8%
: 7704244
13.8%
l 5778183
 
10.3%
i 3852122
 
6.9%
r 3852122
 
6.9%
c 3852122
 
6.9%
g 1926061
 
3.4%
7 1926061
 
3.4%
8 1926061
 
3.4%
4 1926061
 
3.4%
Other values (8) 15408488
27.6%

collectionID
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 45
Median length: 45
Mean length: 45
Min length: 45

Characters and Unicode

Total characters: 86672745
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
2nd row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
3rd row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
4th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6
5th row: urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6

Value                                           Count     Frequency (%)
urn:uuid:f14c21a9-8cbf-4c8b-817f-d19d427e2dd6   1926061   100.0%

Most occurring characters

ValueCountFrequency (%)
d 9630305
11.1%
1 7704244
 
8.9%
- 7704244
 
8.9%
u 5778183
 
6.7%
8 5778183
 
6.7%
2 5778183
 
6.7%
4 5778183
 
6.7%
c 5778183
 
6.7%
f 5778183
 
6.7%
9 3852122
 
4.4%
Other values (9) 23112732
26.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 40447281
46.7%
Decimal Number 34669098
40.0%
Dash Punctuation 7704244
 
8.9%
Other Punctuation 3852122
 
4.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 9630305
23.8%
u 5778183
14.3%
c 5778183
14.3%
f 5778183
14.3%
b 3852122
 
9.5%
r 1926061
 
4.8%
i 1926061
 
4.8%
a 1926061
 
4.8%
n 1926061
 
4.8%
e 1926061
 
4.8%
Decimal Number
ValueCountFrequency (%)
1 7704244
22.2%
8 5778183
16.7%
2 5778183
16.7%
4 5778183
16.7%
9 3852122
11.1%
7 3852122
11.1%
6 1926061
 
5.6%
Dash Punctuation
ValueCountFrequency (%)
- 7704244
100.0%
Other Punctuation
ValueCountFrequency (%)
: 3852122
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 46225464
53.3%
Latin 40447281
46.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
d 9630305
23.8%
u 5778183
14.3%
c 5778183
14.3%
f 5778183
14.3%
b 3852122
 
9.5%
r 1926061
 
4.8%
i 1926061
 
4.8%
a 1926061
 
4.8%
n 1926061
 
4.8%
e 1926061
 
4.8%
Common
ValueCountFrequency (%)
1 7704244
16.7%
- 7704244
16.7%
8 5778183
12.5%
2 5778183
12.5%
4 5778183
12.5%
9 3852122
8.3%
: 3852122
8.3%
7 3852122
8.3%
6 1926061
 
4.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 86672745
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
d 9630305
11.1%
1 7704244
 
8.9%
- 7704244
 
8.9%
u 5778183
 
6.7%
8 5778183
 
6.7%
2 5778183
 
6.7%
4 5778183
 
6.7%
c 5778183
 
6.7%
f 5778183
 
6.7%
9 3852122
 
4.4%
Other values (9) 23112732
26.7%

institutionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 7704244
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: USNM
2nd row: USNM
3rd row: USNM
4th row: USNM
5th row: USNM

Value   Count     Frequency (%)
usnm    1926061   100.0%

Most occurring characters

ValueCountFrequency (%)
U 1926061
25.0%
S 1926061
25.0%
N 1926061
25.0%
M 1926061
25.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 7704244
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
U 1926061
25.0%
S 1926061
25.0%
N 1926061
25.0%
M 1926061
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7704244
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
U 1926061
25.0%
S 1926061
25.0%
N 1926061
25.0%
M 1926061
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7704244
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
U 1926061
25.0%
S 1926061
25.0%
N 1926061
25.0%
M 1926061
25.0%

collectionCode
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 2
Median length: 2
Mean length: 2
Min length: 2

Characters and Unicode

Total characters: 3852122
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: IZ
2nd row: IZ
3rd row: IZ
4th row: IZ
5th row: IZ

Value   Count     Frequency (%)
iz      1926061   100.0%

Most occurring characters

ValueCountFrequency (%)
I 1926061
50.0%
Z 1926061
50.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3852122
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
I 1926061
50.0%
Z 1926061
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 3852122
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1926061
50.0%
Z 1926061
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3852122
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1926061
50.0%
Z 1926061
50.0%

datasetName
Text

Constant 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 19
Mean length: 19
Min length: 19

Characters and Unicode

Total characters: 36595159
Distinct characters: 15
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: NMNH Extant Biology
2nd row: NMNH Extant Biology
3rd row: NMNH Extant Biology
4th row: NMNH Extant Biology
5th row: NMNH Extant Biology

Value     Count     Frequency (%)
nmnh      1926061   33.3%
extant    1926061   33.3%
biology   1926061   33.3%
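
Every row of datasetName holds the same string, yet the value table lists three lowercase words at 33.3% each. That pattern suggests the values are lowercased and split into words before counting. The sketch below reproduces the figures under that assumption. It is an inference from the numbers, not the profiling library's documented behaviour, and `word_frequencies` is an illustrative helper name.

```python
from collections import Counter


def word_frequencies(values):
    """Map each lowercase word to (count, share of all words in percent)."""
    words = Counter()
    for value in values:
        # Assumption: lowercase, then split on whitespace.
        words.update(value.lower().split())
    total = sum(words.values())
    return {word: (n, 100 * n / total) for word, n in words.items()}


freqs = word_frequencies(["NMNH Extant Biology"] * 4)
for word, (count, pct) in freqs.items():
    print(f"{word} {count} {pct:.1f}%")
```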

Most occurring characters

ValueCountFrequency (%)
N 3852122
 
10.5%
(space) 3852122
 
10.5%
t 3852122
 
10.5%
o 3852122
 
10.5%
M 1926061
 
5.3%
H 1926061
 
5.3%
E 1926061
 
5.3%
x 1926061
 
5.3%
a 1926061
 
5.3%
n 1926061
 
5.3%
Other values (5) 9630305
26.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 21186671
57.9%
Uppercase Letter 11556366
31.6%
Space Separator 3852122
 
10.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3852122
18.2%
o 3852122
18.2%
x 1926061
9.1%
a 1926061
9.1%
n 1926061
9.1%
i 1926061
9.1%
l 1926061
9.1%
g 1926061
9.1%
y 1926061
9.1%
Uppercase Letter
ValueCountFrequency (%)
N 3852122
33.3%
M 1926061
16.7%
H 1926061
16.7%
E 1926061
16.7%
B 1926061
16.7%
Space Separator
ValueCountFrequency (%)
(space) 3852122
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32743037
89.5%
Common 3852122
 
10.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 3852122
11.8%
t 3852122
11.8%
o 3852122
11.8%
M 1926061
 
5.9%
H 1926061
 
5.9%
E 1926061
 
5.9%
x 1926061
 
5.9%
a 1926061
 
5.9%
n 1926061
 
5.9%
B 1926061
 
5.9%
Other values (4) 7704244
23.5%
Common
ValueCountFrequency (%)
(space) 3852122
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36595159
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N 3852122
 
10.5%
(space) 3852122
 
10.5%
t 3852122
 
10.5%
o 3852122
 
10.5%
M 1926061
 
5.3%
H 1926061
 
5.3%
E 1926061
 
5.3%
x 1926061
 
5.3%
a 1926061
 
5.3%
n 1926061
 
5.3%
Other values (5) 9630305
26.3%

basisOfRecord
Text

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 18
Median length: 17
Mean length: 17.00144025
Min length: 16

Characters and Unicode

Total characters: 32745811
Distinct characters: 21
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: PreservedSpecimen
2nd row: PreservedSpecimen
3rd row: PreservedSpecimen
4th row: PreservedSpecimen
5th row: PreservedSpecimen

Value                Count     Frequency (%)
preservedspecimen    1921925   99.8%
machineobservation   3455      0.2%
humanobservation     681       < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 9617216
29.4%
r 3847986
11.8%
n 1930197
 
5.9%
i 1929516
 
5.9%
s 1926061
 
5.9%
v 1926061
 
5.9%
c 1925380
 
5.9%
m 1922606
 
5.9%
P 1921925
 
5.9%
p 1921925
 
5.9%
Other values (11) 3876938
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 28893689
88.2%
Uppercase Letter 3852122
 
11.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 9617216
33.3%
r 3847986
13.3%
n 1930197
 
6.7%
i 1929516
 
6.7%
s 1926061
 
6.7%
v 1926061
 
6.7%
c 1925380
 
6.7%
m 1922606
 
6.7%
p 1921925
 
6.7%
d 1921925
 
6.7%
Other values (6) 24816
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
P 1921925
49.9%
S 1921925
49.9%
O 4136
 
0.1%
M 3455
 
0.1%
H 681
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin 32745811
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 9617216
29.4%
r 3847986
11.8%
n 1930197
 
5.9%
i 1929516
 
5.9%
s 1926061
 
5.9%
v 1926061
 
5.9%
c 1925380
 
5.9%
m 1922606
 
5.9%
P 1921925
 
5.9%
p 1921925
 
5.9%
Other values (11) 3876938
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32745811
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 9617216
29.4%
r 3847986
11.8%
n 1930197
 
5.9%
i 1929516
 
5.9%
s 1926061
 
5.9%
v 1926061
 
5.9%
c 1925380
 
5.9%
m 1922606
 
5.9%
P 1921925
 
5.9%
p 1921925
 
5.9%
Other values (11) 3876938
11.8%

occurrenceID
Text

Unique 

Distinct: 1926061
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 14.7 MiB

Length

Max length: 63
Median length: 63
Mean length: 63
Min length: 63

Characters and Unicode

Total characters: 121341843
Distinct characters: 26
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1926061
Unique (%): 100.0%

Sample

1st row: http://n2t.net/ark:/65665/3c831e8df-8799-47a1-8dcf-bcb0b77fd3e3
2nd row: http://n2t.net/ark:/65665/383ab647e-23a7-4086-b71e-36212ccc0eb2
3rd row: http://n2t.net/ark:/65665/383adbf6e-f769-4dc3-8bef-550530af49ee
4th row: http://n2t.net/ark:/65665/3c83aad38-c935-46fa-96c3-e450ebb169cf
5th row: http://n2t.net/ark:/65665/383b126a6-bf3a-4908-bc33-e4435555fcc5

Value                                                             Count     Frequency (%)
http://n2t.net/ark:/65665/3c831e8df-8799-47a1-8dcf-bcb0b77fd3e3   1         < 0.1%
http://n2t.net/ark:/65665/383db58fb-5d8c-4076-bec7-fa6e28ed98a7   1         < 0.1%
http://n2t.net/ark:/65665/3c843fd56-7874-4858-b938-14fdfcb5544c   1         < 0.1%
http://n2t.net/ark:/65665/383bcb698-5477-4feb-9966-d9adae345f09   1         < 0.1%
http://n2t.net/ark:/65665/383bfd766-40bc-4ede-82ca-0df3775130f3   1         < 0.1%
http://n2t.net/ark:/65665/3c84cf22c-2b9b-49fb-91ed-f85efd9e9fa7   1         < 0.1%
http://n2t.net/ark:/65665/383cb8e2a-4f46-4138-82be-3d7989851c9e   1         < 0.1%
http://n2t.net/ark:/65665/3c856104b-9825-44b9-8b57-e69b58510bf8   1         < 0.1%
http://n2t.net/ark:/65665/3c856ef4e-b135-45c8-8511-c533777f0d7a   1         < 0.1%
http://n2t.net/ark:/65665/383ce04ed-5cd8-4a05-90df-39eccc31a990   1         < 0.1%
Other values (1926051)                                            1926051   > 99.9%

Most occurring characters

ValueCountFrequency (%)
/ 9630305
 
7.9%
6 9392699
 
7.7%
- 7704244
 
6.3%
t 7704244
 
6.3%
5 7459909
 
6.1%
a 6017570
 
5.0%
3 5538537
 
4.6%
e 5536681
 
4.6%
2 5536450
 
4.6%
4 5533592
 
4.6%
Other values (16) 51287612
42.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 52484647
43.3%
Lowercase Letter 45744464
37.7%
Other Punctuation 15408488
 
12.7%
Dash Punctuation 7704244
 
6.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 7704244
16.8%
a 6017570
13.2%
e 5536681
12.1%
b 4094745
9.0%
n 3852122
8.4%
d 3614893
7.9%
c 3611051
7.9%
f 3608914
7.9%
k 1926061
 
4.2%
r 1926061
 
4.2%
Other values (2) 3852122
8.4%
Decimal Number
ValueCountFrequency (%)
6 9392699
17.9%
5 7459909
14.2%
3 5538537
10.6%
2 5536450
10.5%
4 5533592
10.5%
8 4094794
7.8%
9 4094577
7.8%
1 3613193
 
6.9%
7 3610777
 
6.9%
0 3610119
 
6.9%
Other Punctuation
ValueCountFrequency (%)
/ 9630305
62.5%
: 3852122
 
25.0%
. 1926061
 
12.5%
Dash Punctuation
ValueCountFrequency (%)
- 7704244
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 75597379
62.3%
Latin 45744464
37.7%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 9630305
12.7%
6 9392699
12.4%
- 7704244
10.2%
5 7459909
9.9%
3 5538537
7.3%
2 5536450
7.3%
4 5533592
7.3%
8 4094794
 
5.4%
9 4094577
 
5.4%
: 3852122
 
5.1%
Other values (4) 12760150
16.9%
Latin
ValueCountFrequency (%)
t 7704244
16.8%
a 6017570
13.2%
e 5536681
12.1%
b 4094745
9.0%
n 3852122
8.4%
d 3614893
7.9%
c 3611051
7.9%
f 3608914
7.9%
k 1926061
 
4.2%
r 1926061
 
4.2%
Other values (2) 3852122
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121341843
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 9630305
 
7.9%
6 9392699
 
7.7%
- 7704244
 
6.3%
t 7704244
 
6.3%
5 7459909
 
6.1%
a 6017570
 
5.0%
3 5538537
 
4.6%
e 5536681
 
4.6%
2 5536450
 
4.6%
4 5533592
 
4.6%
Other values (16) 51287612
42.3%

catalogNumber
Text

Distinct: 1355224
Distinct (%): 70.4%
Missing: 5
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 16
Median length: 11
Mean length: 11.0374122
Min length: 6

Characters and Unicode

Total characters: 21258674
Distinct characters: 63
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1024370
Unique (%): 53.2%

Sample

1st row: USNM 1119015
2nd row: USNM 55168
3rd row: USNM 52536
4th row: USNM E40844
5th row: USNM 241160
ValueCountFrequency (%)
usnm 1926056
50.0%
31
 
< 0.1%
284908 16
 
< 0.1%
653324 13
 
< 0.1%
5357 11
 
< 0.1%
15490 10
 
< 0.1%
859036 10
 
< 0.1%
224878 10
 
< 0.1%
22869 10
 
< 0.1%
284377 9
 
< 0.1%
Other values (1351980) 1925969
50.0%

Most occurring characters

ValueCountFrequency (%)
M 1928174
 
9.1%
U 1926163
 
9.1%
(space) 1926089
 
9.1%
S 1926056
 
9.1%
N 1926056
 
9.1%
1 1809561
 
8.5%
2 1247347
 
5.9%
3 1147683
 
5.4%
4 1110632
 
5.2%
5 1088174
 
5.1%
Other values (53) 5222739
24.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 11555577
54.4%
Uppercase Letter 7762060
36.5%
Space Separator 1926089
 
9.1%
Lowercase Letter 11685
 
0.1%
Other Punctuation 3259
 
< 0.1%
Dash Punctuation 2
 
< 0.1%
Close Punctuation 1
 
< 0.1%
Open Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8272
70.8%
b 1738
 
14.9%
c 637
 
5.5%
d 326
 
2.8%
e 206
 
1.8%
f 143
 
1.2%
g 87
 
0.7%
h 61
 
0.5%
i 40
 
0.3%
j 35
 
0.3%
Other values (16) 140
 
1.2%
Uppercase Letter
ValueCountFrequency (%)
M 1928174
24.8%
U 1926163
24.8%
S 1926056
24.8%
N 1926056
24.8%
E 53442
 
0.7%
I 778
 
< 0.1%
A 697
 
< 0.1%
X 326
 
< 0.1%
B 177
 
< 0.1%
D 128
 
< 0.1%
Other values (10) 63
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1809561
15.7%
2 1247347
10.8%
3 1147683
9.9%
4 1110632
9.6%
5 1088174
9.4%
8 1073263
9.3%
6 1062173
9.2%
7 1058767
9.2%
0 1001934
8.7%
9 956043
8.3%
Other Punctuation
ValueCountFrequency (%)
* 3252
99.8%
. 6
 
0.2%
& 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 1926089
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 2
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 13484929
63.4%
Latin 7773745
36.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1928174
24.8%
U 1926163
24.8%
S 1926056
24.8%
N 1926056
24.8%
E 53442
 
0.7%
a 8272
 
0.1%
b 1738
 
< 0.1%
I 778
 
< 0.1%
A 697
 
< 0.1%
c 637
 
< 0.1%
Other values (36) 1732
 
< 0.1%
Common
ValueCountFrequency (%)
(space) 1926089
14.3%
1 1809561
13.4%
2 1247347
9.2%
3 1147683
8.5%
4 1110632
8.2%
5 1088174
8.1%
8 1073263
8.0%
6 1062173
7.9%
7 1058767
7.9%
0 1001934
7.4%
Other values (7) 959306
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21258674
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1928174
 
9.1%
U 1926163
 
9.1%
1926089
 
9.1%
S 1926056
 
9.1%
N 1926056
 
9.1%
1 1809561
 
8.5%
2 1247347
 
5.9%
3 1147683
 
5.4%
4 1110632
 
5.2%
5 1088174
 
5.1%
Other values (53) 5222739
24.6%

recordNumber
Text

Missing 

Distinct: 119483
Distinct (%): 98.1%
Missing: 1804320
Missing (%): 93.7%
Memory size: 14.7 MiB

Length

Max length: 87
Median length: 14
Mean length: 13.17353234
Min length: 1

Characters and Unicode

Total characters: 1603759
Distinct characters: 81
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
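
As a rough illustration of how per-category character counts like the ones in this report can be derived, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module. It is a minimal sketch, not necessarily how the profiling tool computes its figures; the helper name `category_counts`, the `CATEGORY_NAMES` lookup (a partial, hand-written mapping), and the three-value sample are assumptions made for the example.

```python
import unicodedata
from collections import Counter

# Partial, hand-written mapping from two-letter general-category codes
# to readable labels (an assumption for this sketch; unlisted codes are
# reported by their raw code).
CATEGORY_NAMES = {
    "Lu": "Uppercase Letter",
    "Ll": "Lowercase Letter",
    "Nd": "Decimal Number",
    "Zs": "Space Separator",
    "Po": "Other Punctuation",
    "Pd": "Dash Punctuation",
    "Pc": "Connector Punctuation",
    "Ps": "Open Punctuation",
    "Pe": "Close Punctuation",
    "Sm": "Math Symbol",
    "Cc": "Control",
}


def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts


# Three sample values taken from the recordNumber column.
sample = ["USNPC # 001298", "FPlrv_430", "H-2284"]
counts = category_counts(sample)
for name, n in counts.most_common():
    print(f"{name}: {n}")
```

Running the same function over a full column (with missing values dropped first) would give one count per category, which can then be turned into percentages by dividing by the total number of characters.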

Unique

Unique: 118854
Unique (%): 97.6%

Sample

1st row: USNPC # 001298
2nd row: FPlrv_430
3rd row: H-2284
4th row: USNPC # 066527
5th row: USNPC # 009815
Value                    Count     Frequency (%)
(empty)                  88136     28.7%
usnpc                    88055     28.6%
ullz                     5209      1.7%
rh                       1566      0.5%
k-rh                     1554      0.5%
ce16007-event            223       0.1%
2208                     102       < 0.1%
1430                     92        < 0.1%
1513                     80        < 0.1%
beauty                   75        < 0.1%
Other values (119402)    122305    39.8%

Most occurring characters

ValueCountFrequency (%)
185656
 
11.6%
0 161160
 
10.0%
C 97548
 
6.1%
S 95220
 
5.9%
U 94859
 
5.9%
P 94137
 
5.9%
N 93444
 
5.8%
# 88212
 
5.5%
1 82997
 
5.2%
2 65144
 
4.1%
Other values (71) 545382
34.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 709617
44.2%
Uppercase Letter 576479
35.9%
Space Separator 185656
 
11.6%
Other Punctuation 91627
 
5.7%
Dash Punctuation 15239
 
1.0%
Connector Punctuation 14089
 
0.9%
Lowercase Letter 10490
 
0.7%
Close Punctuation 281
 
< 0.1%
Open Punctuation 271
 
< 0.1%
Math Symbol 10
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 97548
16.9%
S 95220
16.5%
U 94859
16.5%
P 94137
16.3%
N 93444
16.2%
L 12317
 
2.1%
E 11806
 
2.0%
R 10315
 
1.8%
I 7528
 
1.3%
B 7241
 
1.3%
Other values (16) 52064
9.0%
Lowercase Letter
ValueCountFrequency (%)
l 1416
13.5%
v 1363
13.0%
a 1349
12.9%
r 1268
12.1%
t 873
8.3%
e 713
6.8%
s 657
 
6.3%
n 489
 
4.7%
c 300
 
2.9%
i 287
 
2.7%
Other values (16) 1775
16.9%
Decimal Number
ValueCountFrequency (%)
0 161160
22.7%
1 82997
11.7%
2 65144
9.2%
6 58922
 
8.3%
3 58881
 
8.3%
7 58482
 
8.2%
4 56680
 
8.0%
8 56221
 
7.9%
9 55917
 
7.9%
5 55213
 
7.8%
Other Punctuation
ValueCountFrequency (%)
# 88212
96.3%
. 2351
 
2.6%
: 559
 
0.6%
, 400
 
0.4%
; 65
 
0.1%
/ 20
 
< 0.1%
& 10
 
< 0.1%
? 7
 
< 0.1%
* 3
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 15238
> 99.9%
1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 273
97.2%
] 8
 
2.8%
Open Punctuation
ValueCountFrequency (%)
( 263
97.0%
[ 8
 
3.0%
Math Symbol
ValueCountFrequency (%)
+ 5
50.0%
= 5
50.0%
Space Separator
ValueCountFrequency (%)
185656
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 14089
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1016790
63.4%
Latin 586969
36.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 97548
16.6%
S 95220
16.2%
U 94859
16.2%
P 94137
16.0%
N 93444
15.9%
L 12317
 
2.1%
E 11806
 
2.0%
R 10315
 
1.8%
I 7528
 
1.3%
B 7241
 
1.2%
Other values (42) 62554
10.7%
Common
ValueCountFrequency (%)
185656
18.3%
0 161160
15.8%
# 88212
8.7%
1 82997
8.2%
2 65144
 
6.4%
6 58922
 
5.8%
3 58881
 
5.8%
7 58482
 
5.8%
4 56680
 
5.6%
8 56221
 
5.5%
Other values (19) 144435
14.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1603758
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
185656
 
11.6%
0 161160
 
10.0%
C 97548
 
6.1%
S 95220
 
5.9%
U 94859
 
5.9%
P 94137
 
5.9%
N 93444
 
5.8%
# 88212
 
5.5%
1 82997
 
5.2%
2 65144
 
4.1%
Other values (70) 545381
34.0%
Punctuation
ValueCountFrequency (%)
1
100.0%

recordedBy
Text

Missing 

Distinct: 37538
Distinct (%): 3.2%
Missing: 763966
Missing (%): 39.7%
Memory size: 14.7 MiB

Length

Max length: 15905
Median length: 156
Mean length: 23.04389228
Min length: 1

Characters and Unicode

Total characters: 26779192
Distinct characters: 99
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 16586
Unique (%): 1.4%

Sample

1st row: VIMS for BLM/ MMS
2nd row: Lgl Ecological Research Associates/ Environmental Science And Engineering For BLM/ MMS
3rd row: University of Southern California
4th row: United States Fish Commission
5th row: United States Fish Commission
Value                   Count      Frequency (%)
mms                     180985     4.2%
blm                     180983     4.2%
for                     178027     4.2%
fish                    168335     3.9%
united                  164134     3.8%
states                  163470     3.8%
commission              157053     3.7%
(empty)                 149555     3.5%
of                      101735     2.4%
j                       101445     2.4%
Other values (19699)    2735993    63.9%

Most occurring characters

ValueCountFrequency (%)
3118394
 
11.6%
e 2082026
 
7.8%
i 1878666
 
7.0%
n 1615750
 
6.0%
t 1592137
 
5.9%
o 1549174
 
5.8%
s 1529432
 
5.7%
a 1498632
 
5.6%
r 1220911
 
4.6%
M 808430
 
3.0%
Other values (89) 9885640
36.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17586035
65.7%
Uppercase Letter 4861930
 
18.2%
Space Separator 3118394
 
11.6%
Other Punctuation 1144505
 
4.3%
Dash Punctuation 53401
 
0.2%
Decimal Number 6866
 
< 0.1%
Control 6698
 
< 0.1%
Open Punctuation 669
 
< 0.1%
Close Punctuation 669
 
< 0.1%
Math Symbol 25
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2082026
11.8%
i 1878666
10.7%
n 1615750
9.2%
t 1592137
9.1%
o 1549174
8.8%
s 1529432
8.7%
a 1498632
8.5%
r 1220911
 
6.9%
l 767614
 
4.4%
h 563615
 
3.2%
Other values (31) 3288078
18.7%
Uppercase Letter
ValueCountFrequency (%)
M 808430
16.6%
S 653530
13.4%
B 397798
 
8.2%
C 364653
 
7.5%
F 349172
 
7.2%
L 335721
 
6.9%
U 267026
 
5.5%
H 212474
 
4.4%
R 188625
 
3.9%
W 154318
 
3.2%
Other values (17) 1130183
23.2%
Other Punctuation
ValueCountFrequency (%)
. 741714
64.8%
/ 238270
 
20.8%
& 118056
 
10.3%
, 45635
 
4.0%
' 383
 
< 0.1%
: 366
 
< 0.1%
" 36
 
< 0.1%
; 26
 
< 0.1%
? 15
 
< 0.1%
# 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1289
18.8%
2 1273
18.5%
0 1071
15.6%
9 974
14.2%
4 474
 
6.9%
6 442
 
6.4%
8 366
 
5.3%
3 348
 
5.1%
7 334
 
4.9%
5 295
 
4.3%
Control
ValueCountFrequency (%)
6663
99.5%
35
 
0.5%
Open Punctuation
ValueCountFrequency (%)
( 667
99.7%
{ 2
 
0.3%
Close Punctuation
ValueCountFrequency (%)
) 667
99.7%
} 2
 
0.3%
Math Symbol
ValueCountFrequency (%)
+ 21
84.0%
= 4
 
16.0%
Space Separator
ValueCountFrequency (%)
3118394
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 53401
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22447965
83.8%
Common 4331227
 
16.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2082026
 
9.3%
i 1878666
 
8.4%
n 1615750
 
7.2%
t 1592137
 
7.1%
o 1549174
 
6.9%
s 1529432
 
6.8%
a 1498632
 
6.7%
r 1220911
 
5.4%
M 808430
 
3.6%
l 767614
 
3.4%
Other values (58) 7905193
35.2%
Common
ValueCountFrequency (%)
3118394
72.0%
. 741714
 
17.1%
/ 238270
 
5.5%
& 118056
 
2.7%
- 53401
 
1.2%
, 45635
 
1.1%
6663
 
0.2%
1 1289
 
< 0.1%
2 1273
 
< 0.1%
0 1071
 
< 0.1%
Other values (21) 5461
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 26778271
> 99.9%
None 921
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3118394
 
11.6%
e 2082026
 
7.8%
i 1878666
 
7.0%
n 1615750
 
6.0%
t 1592137
 
5.9%
o 1549174
 
5.8%
s 1529432
 
5.7%
a 1498632
 
5.6%
r 1220911
 
4.6%
M 808430
 
3.0%
Other values (73) 9884719
36.9%
None
ValueCountFrequency (%)
é 455
49.4%
ü 102
 
11.1%
á 93
 
10.1%
ö 65
 
7.1%
ä 57
 
6.2%
ó 53
 
5.8%
í 49
 
5.3%
è 14
 
1.5%
ñ 12
 
1.3%
ç 9
 
1.0%
Other values (6) 12
 
1.3%
Distinct: 1067
Distinct (%): 0.1%
Missing: 156
Missing (%): < 0.1%
Memory size: 14.7 MiB

Length

Max length: 5
Median length: 1
Mean length: 1.10839112
Min length: 1

Characters and Unicode

Total characters: 2134656
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 413
Unique (%): < 0.1%

Sample

1st row: 1
2nd row: 11
3rd row: 1
4th row: 26
5th row: 1
Value                  Count     Frequency (%)
1                      995615    51.7%
2                      289522    15.0%
3                      135746    7.0%
4                      99091     5.1%
5                      73915     3.8%
6                      51736     2.7%
10                     38942     2.0%
7                      31367     1.6%
8                      30163     1.6%
9                      18498     1.0%
Other values (1057)    161310    8.4%

Most occurring characters

ValueCountFrequency (%)
1 1131407
53.0%
2 345437
 
16.2%
3 162113
 
7.6%
4 118945
 
5.6%
5 110267
 
5.2%
0 93489
 
4.4%
6 64558
 
3.0%
7 42168
 
2.0%
8 40048
 
1.9%
9 26224
 
1.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2134656
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1131407
53.0%
2 345437
 
16.2%
3 162113
 
7.6%
4 118945
 
5.6%
5 110267
 
5.2%
0 93489
 
4.4%
6 64558
 
3.0%
7 42168
 
2.0%
8 40048
 
1.9%
9 26224
 
1.2%

Most occurring scripts

ValueCountFrequency (%)
Common 2134656
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1131407
53.0%
2 345437
 
16.2%
3 162113
 
7.6%
4 118945
 
5.6%
5 110267
 
5.2%
0 93489
 
4.4%
6 64558
 
3.0%
7 42168
 
2.0%
8 40048
 
1.9%
9 26224
 
1.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2134656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1131407
53.0%
2 345437
 
16.2%
3 162113
 
7.6%
4 118945
 
5.6%
5 110267
 
5.2%
0 93489
 
4.4%
6 64558
 
3.0%
7 42168
 
2.0%
8 40048
 
1.9%
9 26224
 
1.2%

sex
Text

Missing 

Distinct: 299
Distinct (%): 0.2%
Missing: 1744021
Missing (%): 90.5%
Memory size: 14.7 MiB

Length

Max length: 130
Median length: 76
Mean length: 8.258635465
Min length: 4

Characters and Unicode

Total characters: 1503402
Distinct characters: 25
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 137
Unique (%): 0.1%

Sample

1st row: female
2nd row: female
3rd row: male; female
4th row: male
5th row: male
Value            Count     Frequency (%)
female           137569    52.7%
male             121519    46.5%
unknown          1423      0.5%
hermaphrodite    267       0.1%
(empty)          224       0.1%
intersex         146       0.1%
male/female      101       < 0.1%
female/male      9         < 0.1%
neuter           1         < 0.1%
imposex          1         < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 397816
26.5%
a 259575
17.3%
l 259308
17.2%
m 253777
16.9%
f 128855
 
8.6%
; 96869
 
6.4%
79220
 
5.3%
F 8824
 
0.6%
M 5799
 
0.4%
n 4416
 
0.3%
Other values (15) 8943
 
0.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1311268
87.2%
Other Punctuation 96979
 
6.5%
Space Separator 79220
 
5.3%
Uppercase Letter 15935
 
1.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 397816
30.3%
a 259575
19.8%
l 259308
19.8%
m 253777
19.4%
f 128855
 
9.8%
n 4416
 
0.3%
o 1691
 
0.1%
k 1423
 
0.1%
w 1423
 
0.1%
r 681
 
0.1%
Other values (8) 2303
 
0.2%
Uppercase Letter
ValueCountFrequency (%)
F 8824
55.4%
M 5799
36.4%
U 1306
 
8.2%
I 6
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 96869
99.9%
/ 110
 
0.1%
Space Separator
ValueCountFrequency (%)
79220
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1327203
88.3%
Common 176199
 
11.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 397816
30.0%
a 259575
19.6%
l 259308
19.5%
m 253777
19.1%
f 128855
 
9.7%
F 8824
 
0.7%
M 5799
 
0.4%
n 4416
 
0.3%
o 1691
 
0.1%
k 1423
 
0.1%
Other values (12) 5719
 
0.4%
Common
ValueCountFrequency (%)
; 96869
55.0%
79220
45.0%
/ 110
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1503402
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 397816
26.5%
a 259575
17.3%
l 259308
17.2%
m 253777
16.9%
f 128855
 
8.6%
; 96869
 
6.4%
79220
 
5.3%
F 8824
 
0.6%
M 5799
 
0.4%
n 4416
 
0.3%
Other values (15) 8943
 
0.6%

lifeStage
Text

Missing 

Distinct: 852
Distinct (%): 1.0%
Missing: 1837065
Missing (%): 95.4%
Memory size: 14.7 MiB

Length

Max length: 97
Median length: 76
Mean length: 9.342240101
Min length: 1

Characters and Unicode

Total characters: 831422
Distinct characters: 50
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 377
Unique (%): 0.4%

Sample

1st row: ovigerous
2nd row: I
3rd row: larva
4th row: juvenile
5th row: larvae
Value                 Count    Frequency (%)
juvenile              43771    33.5%
(empty)               16544    12.7%
ovigerous             15621    12.0%
adult                 15324    11.7%
ii                    11920    9.1%
i                     9497     7.3%
larvae                7056     5.4%
immature              1741     1.3%
larva                 1318     1.0%
copepodid             666      0.5%
Other values (173)    7154     5.5%

Most occurring characters

ValueCountFrequency (%)
e 117680
14.2%
u 78513
9.4%
l 69686
 
8.4%
; 68834
 
8.3%
v 68052
 
8.2%
i 64284
 
7.7%
n 45375
 
5.5%
j 43457
 
5.2%
41616
 
5.0%
a 40329
 
4.9%
Other values (40) 193596
23.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 685260
82.4%
Other Punctuation 68876
 
8.3%
Space Separator 41616
 
5.0%
Uppercase Letter 35133
 
4.2%
Dash Punctuation 292
 
< 0.1%
Decimal Number 245
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 117680
17.2%
u 78513
11.5%
l 69686
10.2%
v 68052
9.9%
i 64284
9.4%
n 45375
 
6.6%
j 43457
 
6.3%
a 40329
 
5.9%
o 36937
 
5.4%
r 29747
 
4.3%
Other values (16) 91200
13.3%
Uppercase Letter
ValueCountFrequency (%)
I 33827
96.3%
V 527
 
1.5%
J 356
 
1.0%
A 302
 
0.9%
C 45
 
0.1%
X 22
 
0.1%
L 21
 
0.1%
M 19
 
0.1%
P 14
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
; 68834
99.9%
' 19
 
< 0.1%
& 13
 
< 0.1%
. 4
 
< 0.1%
, 4
 
< 0.1%
/ 1
 
< 0.1%
? 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 116
47.3%
2 62
25.3%
3 34
 
13.9%
4 24
 
9.8%
5 7
 
2.9%
6 2
 
0.8%
Space Separator
ValueCountFrequency (%)
41616
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 292
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 720393
86.6%
Common 111029
 
13.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 117680
16.3%
u 78513
10.9%
l 69686
9.7%
v 68052
9.4%
i 64284
8.9%
n 45375
 
6.3%
j 43457
 
6.0%
a 40329
 
5.6%
o 36937
 
5.1%
I 33827
 
4.7%
Other values (25) 122253
17.0%
Common
ValueCountFrequency (%)
; 68834
62.0%
41616
37.5%
- 292
 
0.3%
1 116
 
0.1%
2 62
 
0.1%
3 34
 
< 0.1%
4 24
 
< 0.1%
' 19
 
< 0.1%
& 13
 
< 0.1%
5 7
 
< 0.1%
Other values (5) 12
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 831403
> 99.9%
None 19
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 117680
14.2%
u 78513
9.4%
l 69686
 
8.4%
; 68834
 
8.3%
v 68052
 
8.2%
i 64284
 
7.7%
n 45375
 
5.5%
j 43457
 
5.2%
41616
 
5.0%
a 40329
 
4.9%
Other values (39) 193577
23.3%
None
ValueCountFrequency (%)
ü 19
100.0%

occurrenceStatus
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 20
Distinct characters: 7
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 1993-09-09
2nd row: 1938-09-22
Value         Count    Frequency (%)
1993-09-09    1        50.0%
1938-09-22    1        50.0%

Most occurring characters

ValueCountFrequency (%)
9 6
30.0%
- 4
20.0%
0 3
15.0%
1 2
 
10.0%
3 2
 
10.0%
2 2
 
10.0%
8 1
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 16
80.0%
Dash Punctuation 4
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 6
37.5%
0 3
18.8%
1 2
 
12.5%
3 2
 
12.5%
2 2
 
12.5%
8 1
 
6.2%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 20
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 6
30.0%
- 4
20.0%
0 3
15.0%
1 2
 
10.0%
3 2
 
10.0%
2 2
 
10.0%
8 1
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 6
30.0%
- 4
20.0%
0 3
15.0%
1 2
 
10.0%
3 2
 
10.0%
2 2
 
10.0%
8 1
 
5.0%
Distinct: 527
Distinct (%): < 0.1%
Missing: 1860
Missing (%): 0.1%
Memory size: 14.7 MiB

Length

Max length: 167
Median length: 157
Mean length: 10.12227257
Min length: 3

Characters and Unicode

Total characters: 19477287
Distinct characters: 53
Distinct categories: 8
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 212
Unique (%): < 0.1%

Sample

1st row: Alcohol (Ethanol)
2nd row: Dry
3rd row: Alcohol (Ethanol)
4th row: Dry
5th row: Dry
Value                Count     Frequency (%)
ethanol              906962    30.8%
dry                  902181    30.6%
alcohol              897467    30.5%
slide                129625    4.4%
(empty)              19547     0.7%
95                   16839     0.6%
formalin             12584     0.4%
biorepository        12371     0.4%
isopropyl            10052     0.3%
sorting              6035      0.2%
Other values (40)    31868     1.1%

Most occurring characters

ValueCountFrequency (%)
l 2865932
14.7%
o 2796698
14.4%
h 1805994
 
9.3%
1021330
 
5.2%
r 954159
 
4.9%
t 939400
 
4.8%
n 936694
 
4.8%
a 925585
 
4.8%
y 923820
 
4.7%
E 912862
 
4.7%
Other values (43) 5394813
27.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13610860
69.9%
Uppercase Letter 2924613
 
15.0%
Space Separator 1021330
 
5.2%
Close Punctuation 887415
 
4.6%
Open Punctuation 887415
 
4.6%
Other Punctuation 86944
 
0.4%
Decimal Number 39163
 
0.2%
Dash Punctuation 19547
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 2865932
21.1%
o 2796698
20.5%
h 1805994
13.3%
r 954159
 
7.0%
t 939400
 
6.9%
n 936694
 
6.9%
a 925585
 
6.8%
y 923820
 
6.8%
c 898488
 
6.6%
i 181329
 
1.3%
Other values (12) 382761
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
E 912862
31.2%
D 902455
30.9%
A 898567
30.7%
S 153297
 
5.2%
I 13800
 
0.5%
F 12983
 
0.4%
B 12729
 
0.4%
M 5938
 
0.2%
R 4592
 
0.2%
Y 4591
 
0.2%
Other values (9) 2799
 
0.1%
Decimal Number
ValueCountFrequency (%)
9 18431
47.1%
5 17781
45.4%
0 1802
 
4.6%
8 1080
 
2.8%
1 36
 
0.1%
2 33
 
0.1%
Other Punctuation
ValueCountFrequency (%)
; 67396
77.5%
% 19548
 
22.5%
Space Separator
ValueCountFrequency (%)
1021330
100.0%
Close Punctuation
ValueCountFrequency (%)
) 887415
100.0%
Open Punctuation
ValueCountFrequency (%)
( 887415
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 19547
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 16535473
84.9%
Common 2941814
 
15.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 2865932
17.3%
o 2796698
16.9%
h 1805994
10.9%
r 954159
 
5.8%
t 939400
 
5.7%
n 936694
 
5.7%
a 925585
 
5.6%
y 923820
 
5.6%
E 912862
 
5.5%
D 902455
 
5.5%
Other values (31) 2571874
15.6%
Common
ValueCountFrequency (%)
1021330
34.7%
) 887415
30.2%
( 887415
30.2%
; 67396
 
2.3%
% 19548
 
0.7%
- 19547
 
0.7%
9 18431
 
0.6%
5 17781
 
0.6%
0 1802
 
0.1%
8 1080
 
< 0.1%
Other values (2) 69
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19477287
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 2865932
14.7%
o 2796698
14.4%
h 1805994
 
9.3%
1021330
 
5.2%
r 954159
 
4.9%
t 939400
 
4.8%
n 936694
 
4.8%
a 925585
 
4.8%
y 923820
 
4.7%
E 912862
 
4.7%
Other values (43) 5394813
27.7%

disposition
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 6
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 252
2nd row: 265
Value    Count    Frequency (%)
252      1        50.0%
265      1        50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
5 2
33.3%
6 1
 
16.7%

associatedMedia
Text

Missing 

Distinct: 242386
Distinct (%): 95.5%
Missing: 1672204
Missing (%): 86.8%
Memory size: 14.7 MiB

Length

Max length: 1629
Median length: 49
Mean length: 50.86072868
Min length: 3

Characters and Unicode

Total characters: 12911352
Distinct characters: 31
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 241663
Unique (%): 95.2%

Sample

1st row: https://collections.nmnh.si.edu/media/?i=12038700
2nd row: https://collections.nmnh.si.edu/media/?i=16053651
3rd row: https://collections.nmnh.si.edu/media/?i=18190
4th row: https://collections.nmnh.si.edu/media/?i=55934
5th row: https://collections.nmnh.si.edu/media/?i=10165617
Value                                                Count     Frequency (%)
https://collections.nmnh.si.edu/media/?i=10674432    1623      0.5%
https://collections.nmnh.si.edu/media/?i=10689696    1456      0.4%
https://collections.nmnh.si.edu/media/?i=10696300    1243      0.4%
https://collections.nmnh.si.edu/media/?i=10684813    919       0.3%
https://collections.nmnh.si.edu/media/?i=10669453    853       0.3%
https://collections.nmnh.si.edu/media/?i=10643018    690       0.2%
https://collections.nmnh.si.edu/media/?i=10676407    540       0.2%
https://collections.nmnh.si.edu/media/?i=11455178    456       0.1%
https://collections.nmnh.si.edu/media/?i=10865403    387       0.1%
https://collections.nmnh.si.edu/media/?i=10803950    271       0.1%
Other values (311642)                                318373    97.4%

Most occurring characters

ValueCountFrequency (%)
i 1015420
 
7.9%
/ 1015420
 
7.9%
t 761565
 
5.9%
s 761565
 
5.9%
. 761565
 
5.9%
n 761565
 
5.9%
e 761565
 
5.9%
h 507710
 
3.9%
d 507710
 
3.9%
m 507710
 
3.9%
Other values (21) 5549557
43.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7869505
61.0%
Other Punctuation 2357649
 
18.3%
Decimal Number 2357389
 
18.3%
Math Symbol 253855
 
2.0%
Space Separator 72954
 
0.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 1015420
12.9%
t 761565
9.7%
s 761565
9.7%
n 761565
9.7%
e 761565
9.7%
h 507710
 
6.5%
d 507710
 
6.5%
m 507710
 
6.5%
l 507710
 
6.5%
o 507710
 
6.5%
Other values (4) 1269275
16.1%
Decimal Number
ValueCountFrequency (%)
1 430632
18.3%
5 257203
10.9%
4 232520
9.9%
6 226636
9.6%
0 219805
9.3%
8 208267
8.8%
3 206069
8.7%
2 202946
8.6%
7 194481
8.2%
9 178830
7.6%
Other Punctuation
ValueCountFrequency (%)
/ 1015420
43.1%
. 761565
32.3%
? 253855
 
10.8%
: 253855
 
10.8%
; 72954
 
3.1%
Math Symbol
ValueCountFrequency (%)
= 253855
100.0%
Space Separator
ValueCountFrequency (%)
72954
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 7869505
61.0%
Common 5041847
39.0%

Most frequent character per script

Common
ValueCountFrequency (%)
/ 1015420
20.1%
. 761565
15.1%
1 430632
 
8.5%
5 257203
 
5.1%
? 253855
 
5.0%
= 253855
 
5.0%
: 253855
 
5.0%
4 232520
 
4.6%
6 226636
 
4.5%
0 219805
 
4.4%
Other values (7) 1136501
22.5%
Latin
ValueCountFrequency (%)
i 1015420
12.9%
t 761565
9.7%
s 761565
9.7%
n 761565
9.7%
e 761565
9.7%
h 507710
 
6.5%
d 507710
 
6.5%
m 507710
 
6.5%
l 507710
 
6.5%
o 507710
 
6.5%
Other values (4) 1269275
16.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12911352
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 1015420
 
7.9%
/ 1015420
 
7.9%
t 761565
 
5.9%
s 761565
 
5.9%
. 761565
 
5.9%
n 761565
 
5.9%
e 761565
 
5.9%
h 507710
 
3.9%
d 507710
 
3.9%
m 507710
 
3.9%
Other values (21) 5549557
43.0%

associatedOccurrences
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 8
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: 1993
2nd row: 1938
Value    Count    Frequency (%)
1993     1        50.0%
1938     1        50.0%

Most occurring characters

ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 3
37.5%
1 2
25.0%
3 2
25.0%
8 1
 
12.5%

associatedReferences
Text

Constant  Missing 

Distinct: 1
Distinct (%): 50.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 2
Distinct characters: 1
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 9
2nd row: 9
Value    Count    Frequency (%)
9        2        100.0%

Most occurring characters

ValueCountFrequency (%)
9 2
100.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
9 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
9 2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
9 2
100.0%

associatedSequences
Text

Missing 

Distinct: 5099
Distinct (%): 99.5%
Missing: 1920937
Missing (%): 99.7%
Memory size: 14.7 MiB

Length

Max length: 1349
Median length: 49
Mean length: 85.49824356
Min length: 1

Characters and Unicode

Total characters: 438093
Distinct characters: 61
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5084
Unique (%): 99.2%

Sample

1st row: https://www.ncbi.nlm.nih.gov/gquery?term=AY426351|https://www.ncbi.nlm.nih.gov/gquery?term=AY379442|https://www.ncbi.nlm.nih.gov/gquery?term=AY426385
2nd row: https://www.ncbi.nlm.nih.gov/gquery?term=MH825989
3rd row: https://www.ncbi.nlm.nih.gov/gquery?term=MT223244
4th row: https://www.ncbi.nlm.nih.gov/gquery?term=MH826372
5th row: https://www.ncbi.nlm.nih.gov/gquery?term=KT792656
Value                                                                                                  Count    Frequency (%)
https://www.ncbi.nlm.nih.gov/gquery?term=km521547                                                      12       0.2%
https://www.ncbi.nlm.nih.gov/gquery?term=ay643524                                                      2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ef060028|https://www.ncbi.nlm.nih.gov/gquery?term=kx362271    2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=fj172481                                                      2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=kx832080                                                      2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=srr9613700                                                    2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=jq307001                                                      2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=ku285912                                                      2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=eu863366|https://www.ncbi.nlm.nih.gov/gquery?term=eu863300    2        < 0.1%
https://www.ncbi.nlm.nih.gov/gquery?term=mh244118                                                      2        < 0.1%
Other values (5089)                                                                                    5094     99.4%

Most occurring characters

ValueCountFrequency (%)
. 35419
 
8.1%
t 26562
 
6.1%
/ 26562
 
6.1%
w 26562
 
6.1%
n 26562
 
6.1%
h 17708
 
4.0%
r 17708
 
4.0%
i 17708
 
4.0%
e 17708
 
4.0%
m 17708
 
4.0%
Other values (51) 207886
47.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 274474
62.7%
Other Punctuation 79689
 
18.2%
Decimal Number 53458
 
12.2%
Uppercase Letter 17884
 
4.1%
Math Symbol 12586
 
2.9%
Dash Punctuation 1
 
< 0.1%
Connector Punctuation 1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
K 3906
21.8%
M 3764
21.0%
W 1587
8.9%
U 1539
 
8.6%
F 833
 
4.7%
J 772
 
4.3%
X 719
 
4.0%
C 697
 
3.9%
T 538
 
3.0%
H 533
 
3.0%
Other values (14) 2996
16.8%
Lowercase Letter
ValueCountFrequency (%)
t 26562
 
9.7%
w 26562
 
9.7%
n 26562
 
9.7%
h 17708
 
6.5%
r 17708
 
6.5%
i 17708
 
6.5%
e 17708
 
6.5%
m 17708
 
6.5%
g 17708
 
6.5%
q 8854
 
3.2%
Other values (9) 79686
29.0%
Decimal Number
ValueCountFrequency (%)
2 7336
13.7%
8 6190
11.6%
0 5590
10.5%
4 5209
9.7%
6 5207
9.7%
5 5041
9.4%
3 4920
9.2%
9 4838
9.1%
1 4744
8.9%
7 4383
8.2%
Other Punctuation
ValueCountFrequency (%)
. 35419
44.4%
/ 26562
33.3%
? 8854
 
11.1%
: 8854
 
11.1%
Math Symbol
ValueCountFrequency (%)
= 8854
70.3%
| 3732
29.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 292358
66.7%
Common 145735
33.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 26562
 
9.1%
w 26562
 
9.1%
n 26562
 
9.1%
h 17708
 
6.1%
r 17708
 
6.1%
i 17708
 
6.1%
e 17708
 
6.1%
m 17708
 
6.1%
g 17708
 
6.1%
q 8854
 
3.0%
Other values (33) 97570
33.4%
Common
ValueCountFrequency (%)
. 35419
24.3%
/ 26562
18.2%
= 8854
 
6.1%
? 8854
 
6.1%
: 8854
 
6.1%
2 7336
 
5.0%
8 6190
 
4.2%
0 5590
 
3.8%
4 5209
 
3.6%
6 5207
 
3.6%
Other values (8) 27660
19.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 438093
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 35419
 
8.1%
t 26562
 
6.1%
/ 26562
 
6.1%
w 26562
 
6.1%
n 26562
 
6.1%
h 17708
 
4.0%
r 17708
 
4.0%
i 17708
 
4.0%
e 17708
 
4.0%
m 17708
 
4.0%
Other values (51) 207886
47.5%

occurrenceRemarks
Text

Missing 

Distinct384844
Distinct (%)49.2%
Missing1144278
Missing (%)59.4%
Memory size14.7 MiB
2025-01-14T11:47:55.005482image/svg+xmlMatplotlib v3.9.4, https://matplotlib.org/

Length

Max length: 29813
Median length: 1371
Mean length: 61.44131172
Min length: 1

Characters and Unicode

Total characters: 48033773
Distinct characters: 133
Distinct categories: 18
Distinct scripts: 3
Distinct blocks: 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
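
As a rough illustration of how per-category character counts like the ones in this report could be derived, the sketch below uses Python's standard `unicodedata` module. The function name `category_counts` is a hypothetical helper, and the four input strings are copied from the occurrenceRemarks sample rows; this is an assumption-laden sketch, not the profiling tool's actual implementation. The two-letter codes map to the report's labels as follows: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Po` is Other Punctuation, and `Zs` is Space Separator.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters by Unicode general category, skipping missing values.

    Hypothetical helper for illustration only.
    """
    counts = Counter()
    for value in values:
        if value is None:  # treat None as a missing cell
            continue
        for ch in value:
            # unicodedata.category returns a two-letter code such as "Lu" or "Zs"
            counts[unicodedata.category(ch)] += 1
    return counts


# Four of the sample rows shown for occurrenceRemarks
sample = ["Jewett.; Stearns.", "Bartsch", "U. S. B. Fish", "C.R. Laws"]
counts = category_counts(sample)
for code, n in counts.most_common():
    print(code, n)
```

Dividing each count by the total number of characters would give percentages comparable to the Frequency (%) column.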

Unique

Unique: 322638
Unique (%): 41.3%

Sample

1st row: Jewett.; Stearns.
2nd row: Bartsch
3rd row: 15 Nov. 1973; Jones, Dawson, del Rosario; Fitzgerald; NMNH-STRI Survey
4th row: U. S. B. Fish
5th row: C.R. Laws
ValueCountFrequency (%)
coll 143172
 
2.1%
of 115241
 
1.7%
and 111346
 
1.7%
a 107275
 
1.6%
by 89596
 
1.3%
87789
 
1.3%
2 65611
 
1.0%
3 63122
 
0.9%
was 62148
 
0.9%
formalin 58887
 
0.9%
Other values (237636) 5772371
86.5%

Most occurring characters

ValueCountFrequency (%)
5890993
 
12.3%
e 2965089
 
6.2%
o 2600658
 
5.4%
a 2412401
 
5.0%
i 2008687
 
4.2%
t 1976952
 
4.1%
n 1974652
 
4.1%
r 1876623
 
3.9%
s 1857069
 
3.9%
l 1811612
 
3.8%
Other values (123) 22659037
47.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27315643
56.9%
Space Separator 5890993
 
12.3%
Uppercase Letter 5675734
 
11.8%
Other Punctuation 4998736
 
10.4%
Decimal Number 3433710
 
7.1%
Dash Punctuation 298658
 
0.6%
Open Punctuation 185583
 
0.4%
Close Punctuation 185432
 
0.4%
Control 21090
 
< 0.1%
Math Symbol 15137
 
< 0.1%
Other values (8) 13057
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 2965089
10.9%
o 2600658
 
9.5%
a 2412401
 
8.8%
i 2008687
 
7.4%
t 1976952
 
7.2%
n 1974652
 
7.2%
r 1876623
 
6.9%
s 1857069
 
6.8%
l 1811612
 
6.6%
d 1161297
 
4.3%
Other values (32) 6670603
24.4%
Uppercase Letter
ValueCountFrequency (%)
C 695582
 
12.3%
S 674934
 
11.9%
B 358975
 
6.3%
F 347135
 
6.1%
P 325526
 
5.7%
N 311183
 
5.5%
M 289244
 
5.1%
A 261990
 
4.6%
R 238530
 
4.2%
H 231655
 
4.1%
Other values (17) 1940980
34.2%
Other Punctuation
ValueCountFrequency (%)
" 1192398
23.9%
. 1191661
23.8%
; 1043726
20.9%
, 582173
11.6%
: 567803
11.4%
% 166876
 
3.3%
/ 97178
 
1.9%
! 65383
 
1.3%
' 33797
 
0.7%
# 25846
 
0.5%
Other values (6) 31895
 
0.6%
Decimal Number
ValueCountFrequency (%)
1 684528
19.9%
2 446207
13.0%
9 386796
11.3%
0 370228
10.8%
3 301861
8.8%
7 286303
8.3%
5 255519
 
7.4%
6 251045
 
7.3%
4 238239
 
6.9%
8 212984
 
6.2%
Math Symbol
ValueCountFrequency (%)
+ 11234
74.2%
= 2004
 
13.2%
| 1638
 
10.8%
> 140
 
0.9%
~ 94
 
0.6%
< 23
 
0.2%
± 2
 
< 0.1%
× 2
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 3563
96.0%
91
 
2.5%
49
 
1.3%
6
 
0.2%
© 2
 
0.1%
® 1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 298188
99.8%
469
 
0.2%
1
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 95133
51.3%
{ 87809
47.3%
[ 2641
 
1.4%
Close Punctuation
ValueCountFrequency (%)
) 95004
51.2%
} 87803
47.4%
] 2625
 
1.4%
Other Number
ValueCountFrequency (%)
½ 1
33.3%
¼ 1
33.3%
³ 1
33.3%
Control
ValueCountFrequency (%)
20979
99.5%
111
 
0.5%
Currency Symbol
ValueCountFrequency (%)
$ 383
99.5%
2
 
0.5%
Final Punctuation
ValueCountFrequency (%)
213
99.5%
» 1
 
0.5%
Initial Punctuation
ValueCountFrequency (%)
213
99.5%
« 1
 
0.5%
Space Separator
ValueCountFrequency (%)
5890993
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7399
100.0%
Other Letter
ValueCountFrequency (%)
º 1128
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32992469
68.7%
Common 15041296
31.3%
Greek 8
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 2965089
 
9.0%
o 2600658
 
7.9%
a 2412401
 
7.3%
i 2008687
 
6.1%
t 1976952
 
6.0%
n 1974652
 
6.0%
r 1876623
 
5.7%
s 1857069
 
5.6%
l 1811612
 
5.5%
d 1161297
 
3.5%
Other values (57) 12347429
37.4%
Common
ValueCountFrequency (%)
5890993
39.2%
" 1192398
 
7.9%
. 1191661
 
7.9%
; 1043726
 
6.9%
1 684528
 
4.6%
, 582173
 
3.9%
: 567803
 
3.8%
2 446207
 
3.0%
9 386796
 
2.6%
0 370228
 
2.5%
Other values (54) 2684783
17.8%
Greek
ValueCountFrequency (%)
μ 7
87.5%
π 1
 
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 48027274
> 99.9%
None 5313
 
< 0.1%
Punctuation 1038
 
< 0.1%
Misc Symbols 146
 
< 0.1%
Currency Symbols 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5890993
 
12.3%
e 2965089
 
6.2%
o 2600658
 
5.4%
a 2412401
 
5.0%
i 2008687
 
4.2%
t 1976952
 
4.1%
n 1974652
 
4.1%
r 1876623
 
3.9%
s 1857069
 
3.9%
l 1811612
 
3.8%
Other values (86) 22652538
47.2%
None
ValueCountFrequency (%)
° 3563
67.1%
º 1128
 
21.2%
é 384
 
7.2%
ü 87
 
1.6%
µ 28
 
0.5%
ö 28
 
0.5%
ã 14
 
0.3%
à 12
 
0.2%
ó 11
 
0.2%
á 9
 
0.2%
Other values (18) 49
 
0.9%
Punctuation
ValueCountFrequency (%)
469
45.2%
213
20.5%
213
20.5%
142
 
13.7%
1
 
0.1%
Misc Symbols
ValueCountFrequency (%)
91
62.3%
49
33.6%
6
 
4.1%
Currency Symbols
ValueCountFrequency (%)
2
100.0%

materialEntityRemarks
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 62
Median length: 48.5
Mean length: 48.5
Min length: 35

Characters and Unicode

Total characters: 97
Distinct characters: 28
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: North America, North Pacific Ocean, Gulf Of California, Mexico
2nd row: North America, United States, Texas
ValueCountFrequency (%)
north 3
21.4%
america 2
14.3%
pacific 1
 
7.1%
ocean 1
 
7.1%
gulf 1
 
7.1%
of 1
 
7.1%
california 1
 
7.1%
mexico 1
 
7.1%
united 1
 
7.1%
states 1
 
7.1%

Most occurring characters

ValueCountFrequency (%)
12
 
12.4%
a 8
 
8.2%
i 8
 
8.2%
e 7
 
7.2%
r 6
 
6.2%
t 6
 
6.2%
c 6
 
6.2%
o 5
 
5.2%
, 5
 
5.2%
f 4
 
4.1%
Other values (18) 30
30.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 66
68.0%
Uppercase Letter 14
 
14.4%
Space Separator 12
 
12.4%
Other Punctuation 5
 
5.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
12.1%
i 8
12.1%
e 7
10.6%
r 6
9.1%
t 6
9.1%
c 6
9.1%
o 5
7.6%
f 4
 
6.1%
n 3
 
4.5%
h 3
 
4.5%
Other values (6) 10
15.2%
Uppercase Letter
ValueCountFrequency (%)
N 3
21.4%
A 2
14.3%
O 2
14.3%
P 1
 
7.1%
G 1
 
7.1%
C 1
 
7.1%
M 1
 
7.1%
U 1
 
7.1%
S 1
 
7.1%
T 1
 
7.1%
Space Separator
ValueCountFrequency (%)
12
100.0%
Other Punctuation
ValueCountFrequency (%)
, 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 80
82.5%
Common 17
 
17.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
 
10.0%
i 8
 
10.0%
e 7
 
8.8%
r 6
 
7.5%
t 6
 
7.5%
c 6
 
7.5%
o 5
 
6.2%
f 4
 
5.0%
n 3
 
3.8%
N 3
 
3.8%
Other values (16) 24
30.0%
Common
ValueCountFrequency (%)
12
70.6%
, 5
29.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 97
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
12
 
12.4%
a 8
 
8.2%
i 8
 
8.2%
e 7
 
7.2%
r 6
 
6.2%
t 6
 
6.2%
c 6
 
6.2%
o 5
 
5.2%
, 5
 
5.2%
f 4
 
4.1%
Other values (18) 30
30.9%

verbatimLabel
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 34
Median length: 23.5
Mean length: 23.5
Min length: 13

Characters and Unicode

Total characters: 47
Distinct characters: 17
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: North America, North Pacific Ocean
2nd row: North America
ValueCountFrequency (%)
north 3
42.9%
america 2
28.6%
pacific 1
 
14.3%
ocean 1
 
14.3%

Most occurring characters

ValueCountFrequency (%)
r 5
10.6%
c 5
10.6%
5
10.6%
a 4
8.5%
i 4
8.5%
N 3
 
6.4%
o 3
 
6.4%
e 3
 
6.4%
h 3
 
6.4%
t 3
 
6.4%
Other values (7) 9
19.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 34
72.3%
Uppercase Letter 7
 
14.9%
Space Separator 5
 
10.6%
Other Punctuation 1
 
2.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 5
14.7%
c 5
14.7%
a 4
11.8%
i 4
11.8%
o 3
8.8%
e 3
8.8%
h 3
8.8%
t 3
8.8%
m 2
 
5.9%
f 1
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
N 3
42.9%
A 2
28.6%
P 1
 
14.3%
O 1
 
14.3%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 41
87.2%
Common 6
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 5
12.2%
c 5
12.2%
a 4
9.8%
i 4
9.8%
N 3
7.3%
o 3
7.3%
e 3
7.3%
h 3
7.3%
t 3
7.3%
m 2
 
4.9%
Other values (5) 6
14.6%
Common
ValueCountFrequency (%)
5
83.3%
, 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 5
10.6%
c 5
10.6%
5
10.6%
a 4
8.5%
i 4
8.5%
N 3
 
6.4%
o 3
 
6.4%
e 3
 
6.4%
h 3
 
6.4%
t 3
 
6.4%
Other values (7) 9
19.1%

materialSampleID
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926060
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 39
Median length: 39
Mean length: 39
Min length: 39

Characters and Unicode

Total characters: 39
Distinct characters: 19
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: North Pacific Ocean, Gulf Of California
ValueCountFrequency (%)
north 1
16.7%
pacific 1
16.7%
ocean 1
16.7%
gulf 1
16.7%
of 1
16.7%
california 1
16.7%

Most occurring characters

ValueCountFrequency (%)
5
12.8%
i 4
10.3%
a 4
10.3%
f 4
10.3%
c 3
 
7.7%
n 2
 
5.1%
r 2
 
5.1%
l 2
 
5.1%
o 2
 
5.1%
O 2
 
5.1%
Other values (9) 9
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 27
69.2%
Uppercase Letter 6
 
15.4%
Space Separator 5
 
12.8%
Other Punctuation 1
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
14.8%
a 4
14.8%
f 4
14.8%
c 3
11.1%
n 2
7.4%
r 2
7.4%
l 2
7.4%
o 2
7.4%
u 1
 
3.7%
e 1
 
3.7%
Other values (2) 2
7.4%
Uppercase Letter
ValueCountFrequency (%)
O 2
33.3%
G 1
16.7%
N 1
16.7%
P 1
16.7%
C 1
16.7%
Space Separator
ValueCountFrequency (%)
5
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 33
84.6%
Common 6
 
15.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
12.1%
a 4
12.1%
f 4
12.1%
c 3
9.1%
n 2
 
6.1%
r 2
 
6.1%
l 2
 
6.1%
o 2
 
6.1%
O 2
 
6.1%
u 1
 
3.0%
Other values (7) 7
21.2%
Common
ValueCountFrequency (%)
5
83.3%
, 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 39
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
5
12.8%
i 4
10.3%
a 4
10.3%
f 4
10.3%
c 3
 
7.7%
n 2
 
5.1%
r 2
 
5.1%
l 2
 
5.1%
o 2
 
5.1%
O 2
 
5.1%
Other values (9) 9
23.1%

eventType
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 13
Median length: 9.5
Mean length: 9.5
Min length: 6

Characters and Unicode

Total characters: 19
Distinct characters: 14
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Mexico
2nd row: United States
ValueCountFrequency (%)
mexico 1
33.3%
united 1
33.3%
states 1
33.3%

Most occurring characters

ValueCountFrequency (%)
e 3
15.8%
t 3
15.8%
i 2
10.5%
M 1
 
5.3%
x 1
 
5.3%
c 1
 
5.3%
o 1
 
5.3%
U 1
 
5.3%
n 1
 
5.3%
d 1
 
5.3%
Other values (4) 4
21.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
78.9%
Uppercase Letter 3
 
15.8%
Space Separator 1
 
5.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
20.0%
t 3
20.0%
i 2
13.3%
x 1
 
6.7%
c 1
 
6.7%
o 1
 
6.7%
n 1
 
6.7%
d 1
 
6.7%
a 1
 
6.7%
s 1
 
6.7%
Uppercase Letter
ValueCountFrequency (%)
M 1
33.3%
U 1
33.3%
S 1
33.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18
94.7%
Common 1
 
5.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
16.7%
t 3
16.7%
i 2
11.1%
M 1
 
5.6%
x 1
 
5.6%
c 1
 
5.6%
o 1
 
5.6%
U 1
 
5.6%
n 1
 
5.6%
d 1
 
5.6%
Other values (3) 3
16.7%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3
15.8%
t 3
15.8%
i 2
10.5%
M 1
 
5.3%
x 1
 
5.3%
c 1
 
5.3%
o 1
 
5.3%
U 1
 
5.3%
n 1
 
5.3%
d 1
 
5.3%
Other values (4) 4
21.1%

fieldNumber
Text

Missing 

Distinct: 62645
Distinct (%): 10.7%
Missing: 1339537
Missing (%): 69.5%
Memory size: 14.7 MiB

Length

Max length: 111
Median length: 63
Mean length: 13.61587079
Min length: 1

Characters and Unicode

Total characters: 7986035
Distinct characters: 82
Distinct categories: 11
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 27485
Unique (%): 4.7%

Sample

1st row: MMS-CABP/02B-E4
2nd row: 4/III-23-TDS
3rd row: USARP/EL/12/1002/USC
4th row: USFC/A2059
5th row: USFC/A5374
ValueCountFrequency (%)
mms-mafla/jar 17287
 
2.6%
bolland/rfb 7604
 
1.1%
humes 5242
 
0.8%
jpem 5029
 
0.8%
4975
 
0.8%
rh 2306
 
0.3%
k-rh 1556
 
0.2%
spm 1163
 
0.2%
mnhn-norfolk 1131
 
0.2%
haul 1039
 
0.2%
Other values (59081) 614323
92.8%

Most occurring characters

ValueCountFrequency (%)
/ 742625
 
9.3%
S 650584
 
8.1%
M 501285
 
6.3%
- 479984
 
6.0%
A 421779
 
5.3%
1 403168
 
5.0%
0 377764
 
4.7%
C 368103
 
4.6%
2 360900
 
4.5%
U 266483
 
3.3%
Other values (72) 3413360
42.7%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 3901644
48.9%
Decimal Number 2536481
31.8%
Other Punctuation 835531
 
10.5%
Dash Punctuation 479984
 
6.0%
Lowercase Letter 145874
 
1.8%
Space Separator 75131
 
0.9%
Connector Punctuation 7570
 
0.1%
Open Punctuation 1756
 
< 0.1%
Close Punctuation 1756
 
< 0.1%
Math Symbol 302
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
S 650584
16.7%
M 501285
12.8%
A 421779
10.8%
C 368103
9.4%
U 266483
 
6.8%
F 236148
 
6.1%
I 186834
 
4.8%
R 170590
 
4.4%
L 169956
 
4.4%
P 165581
 
4.2%
Other values (16) 764301
19.6%
Lowercase Letter
ValueCountFrequency (%)
e 25302
17.3%
r 24949
17.1%
a 23100
15.8%
l 9448
 
6.5%
s 8103
 
5.6%
i 7885
 
5.4%
o 7864
 
5.4%
u 7557
 
5.2%
m 5785
 
4.0%
t 4696
 
3.2%
Other values (16) 21185
14.5%
Other Punctuation
ValueCountFrequency (%)
/ 742625
88.9%
: 80839
 
9.7%
. 4233
 
0.5%
; 3671
 
0.4%
, 2634
 
0.3%
# 938
 
0.1%
\ 340
 
< 0.1%
? 150
 
< 0.1%
& 61
 
< 0.1%
" 16
 
< 0.1%
Other values (2) 24
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 403168
15.9%
0 377764
14.9%
2 360900
14.2%
5 260694
10.3%
3 252335
9.9%
4 217269
8.6%
7 192296
7.6%
6 178137
7.0%
8 164674
6.5%
9 129244
 
5.1%
Math Symbol
ValueCountFrequency (%)
+ 290
96.0%
= 12
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
- 479984
100.0%
Space Separator
ValueCountFrequency (%)
75131
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7570
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1756
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1756
100.0%
Control
ValueCountFrequency (%)
 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 4047518
50.7%
Common 3938517
49.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 650584
16.1%
M 501285
12.4%
A 421779
10.4%
C 368103
9.1%
U 266483
 
6.6%
F 236148
 
5.8%
I 186834
 
4.6%
R 170590
 
4.2%
L 169956
 
4.2%
P 165581
 
4.1%
Other values (42) 910175
22.5%
Common
ValueCountFrequency (%)
/ 742625
18.9%
- 479984
12.2%
1 403168
10.2%
0 377764
9.6%
2 360900
9.2%
5 260694
 
6.6%
3 252335
 
6.4%
4 217269
 
5.5%
7 192296
 
4.9%
6 178137
 
4.5%
Other values (20) 473345
12.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7986035
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
/ 742625
 
9.3%
S 650584
 
8.1%
M 501285
 
6.3%
- 479984
 
6.0%
A 421779
 
5.3%
1 403168
 
5.0%
0 377764
 
4.7%
C 368103
 
4.6%
2 360900
 
4.5%
U 266483
 
3.3%
Other values (72) 3413360
42.7%

eventDate
Text

Missing 

Distinct: 46451
Distinct (%): 3.7%
Missing: 684431
Missing (%): 35.5%
Memory size: 14.7 MiB

Length

Max length: 24
Median length: 10
Mean length: 9.847445696
Min length: 4

Characters and Unicode

Total characters: 12226884
Distinct characters: 21
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 7284
Unique (%): 0.6%

Sample

1st row: 1976-03-03
2nd row: 1984-05-15
3rd row: 1964-03-15
4th row: 1883-08-31
5th row: 1909-03-02
ValueCountFrequency (%)
1915 6240
 
0.5%
1982-07-21 5683
 
0.5%
1981-07-06 5412
 
0.4%
1983-05-13 5155
 
0.4%
1982-11-19 5037
 
0.4%
1982-02-10 4461
 
0.4%
1981-11-09 4296
 
0.3%
1913 4289
 
0.3%
1982-05-10 4268
 
0.3%
1977-01-28/1977-02-13 3795
 
0.3%
Other values (46407) 1193140
96.1%

Most occurring characters

ValueCountFrequency (%)
1 2356589
19.3%
- 2334518
19.1%
0 1811279
14.8%
9 1510678
12.4%
2 833024
 
6.8%
8 784802
 
6.4%
7 719423
 
5.9%
6 566996
 
4.6%
5 438603
 
3.6%
3 432985
 
3.5%
Other values (11) 437987
 
3.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 9840824
80.5%
Dash Punctuation 2334518
 
19.1%
Other Punctuation 51245
 
0.4%
Lowercase Letter 150
 
< 0.1%
Space Separator 146
 
< 0.1%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2356589
23.9%
0 1811279
18.4%
9 1510678
15.4%
2 833024
 
8.5%
8 784802
 
8.0%
7 719423
 
7.3%
6 566996
 
5.8%
5 438603
 
4.5%
3 432985
 
4.4%
4 386445
 
3.9%
Lowercase Letter
ValueCountFrequency (%)
o 73
48.7%
r 73
48.7%
e 1
 
0.7%
x 1
 
0.7%
a 1
 
0.7%
s 1
 
0.7%
Other Punctuation
ValueCountFrequency (%)
/ 50988
99.5%
, 257
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 2334518
100.0%
Space Separator
ValueCountFrequency (%)
146
100.0%
Uppercase Letter
ValueCountFrequency (%)
T 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 12226733
> 99.9%
Latin 151
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2356589
19.3%
- 2334518
19.1%
0 1811279
14.8%
9 1510678
12.4%
2 833024
 
6.8%
8 784802
 
6.4%
7 719423
 
5.9%
6 566996
 
4.6%
5 438603
 
3.6%
3 432985
 
3.5%
Other values (4) 437836
 
3.6%
Latin
ValueCountFrequency (%)
o 73
48.3%
r 73
48.3%
T 1
 
0.7%
e 1
 
0.7%
x 1
 
0.7%
a 1
 
0.7%
s 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12226884
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2356589
19.3%
- 2334518
19.1%
0 1811279
14.8%
9 1510678
12.4%
2 833024
 
6.8%
8 784802
 
6.4%
7 719423
 
5.9%
6 566996
 
4.6%
5 438603
 
3.6%
3 432985
 
3.5%
Other values (11) 437987
 
3.6%

startDayOfYear
Text

Missing 

Distinct: 366
Distinct (%): < 0.1%
Missing: 772926
Missing (%): 40.1%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 2.745095761
Min length: 1

Characters and Unicode

Total characters: 3165466
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 63
2nd row: 136
3rd row: 75
4th row: 243
5th row: 61
ValueCountFrequency (%)
243 12547
 
1.1%
334 10327
 
0.9%
151 9378
 
0.8%
202 9211
 
0.8%
133 9049
 
0.8%
212 8665
 
0.8%
187 8345
 
0.7%
130 7951
 
0.7%
323 7924
 
0.7%
41 7863
 
0.7%
Other values (356) 1061875
92.1%

Most occurring characters

ValueCountFrequency (%)
1 622454
19.7%
2 590934
18.7%
3 457326
14.4%
4 256897
8.1%
5 233467
 
7.4%
0 218600
 
6.9%
6 207670
 
6.6%
9 203223
 
6.4%
7 193100
 
6.1%
8 181795
 
5.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3165466
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 622454
19.7%
2 590934
18.7%
3 457326
14.4%
4 256897
8.1%
5 233467
 
7.4%
0 218600
 
6.9%
6 207670
 
6.6%
9 203223
 
6.4%
7 193100
 
6.1%
8 181795
 
5.7%

Most occurring scripts

ValueCountFrequency (%)
Common 3165466
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 622454
19.7%
2 590934
18.7%
3 457326
14.4%
4 256897
8.1%
5 233467
 
7.4%
0 218600
 
6.9%
6 207670
 
6.6%
9 203223
 
6.4%
7 193100
 
6.1%
8 181795
 
5.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3165466
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 622454
19.7%
2 590934
18.7%
3 457326
14.4%
4 256897
8.1%
5 233467
 
7.4%
0 218600
 
6.9%
6 207670
 
6.6%
9 203223
 
6.4%
7 193100
 
6.1%
8 181795
 
5.7%
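
The five startDayOfYear sample values above (63, 136, 75, 243, 61) appear to match the ordinal day of the year of the five eventDate sample values (1976-03-03, 1984-05-15, 1964-03-15, 1883-08-31, 1909-03-02). A minimal sketch of that relationship, assuming full `YYYY-MM-DD` dates and treating the part before `/` as the start of an interval, could look like this. The helper name `start_day_of_year` is hypothetical, and returning `None` for year-only values such as `1915` is an assumption made here rather than anything the report states.

```python
from datetime import date


def start_day_of_year(event_date):
    """Return the 1-based day of year for the start of an ISO 8601 date or interval.

    Hypothetical helper: returns None when the start is not a full YYYY-MM-DD date.
    """
    # For an interval such as "1977-01-28/1977-02-13", keep only the start date
    start = event_date.split("/")[0].strip()
    try:
        return date.fromisoformat(start).timetuple().tm_yday
    except ValueError:
        # Year-only or otherwise partial dates have no well-defined day of year
        return None


# eventDate sample rows paired with the startDayOfYear sample rows
pairs = [
    ("1976-03-03", 63),
    ("1984-05-15", 136),
    ("1964-03-15", 75),
    ("1883-08-31", 243),
    ("1909-03-02", 61),
]
for event_date, expected in pairs:
    print(event_date, start_day_of_year(event_date), expected)
```

A check like this can help flag rows where the derived day-of-year columns disagree with eventDate.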

endDayOfYear
Text

Missing 

Distinct: 368
Distinct (%): < 0.1%
Missing: 773095
Missing (%): 40.1%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 3
Mean length: 2.745963021
Min length: 1

Characters and Unicode

Total characters: 3166002
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: 63
2nd row: 136
3rd row: 75
4th row: 243
5th row: 61
ValueCountFrequency (%)
243 12439
 
1.1%
334 10162
 
0.9%
151 9376
 
0.8%
202 9186
 
0.8%
133 9037
 
0.8%
212 8808
 
0.8%
187 8348
 
0.7%
41 7969
 
0.7%
323 7922
 
0.7%
130 7868
 
0.7%
Other values (360) 1061853
92.1%

Most occurring characters

ValueCountFrequency (%)
1 626592
19.8%
2 588319
18.6%
3 458636
14.5%
4 260045
8.2%
5 235565
 
7.4%
0 220401
 
7.0%
6 202442
 
6.4%
9 198660
 
6.3%
7 192143
 
6.1%
8 183183
 
5.8%
Other values (10) 16
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3165986
> 99.9%
Lowercase Letter 10
 
< 0.1%
Uppercase Letter 4
 
< 0.1%
Space Separator 2
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 626592
19.8%
2 588319
18.6%
3 458636
14.5%
4 260045
8.2%
5 235565
 
7.4%
0 220401
 
7.0%
6 202442
 
6.4%
9 198660
 
6.3%
7 192143
 
6.1%
8 183183
 
5.8%
Lowercase Letter
ValueCountFrequency (%)
a 4
40.0%
e 2
20.0%
z 1
 
10.0%
g 1
 
10.0%
l 1
 
10.0%
k 1
 
10.0%
Uppercase Letter
ValueCountFrequency (%)
L 2
50.0%
P 1
25.0%
E 1
25.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3165988
> 99.9%
Latin 14
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
1 626592
19.8%
2 588319
18.6%
3 458636
14.5%
4 260045
8.2%
5 235565
 
7.4%
0 220401
 
7.0%
6 202442
 
6.4%
9 198660
 
6.3%
7 192143
 
6.1%
8 183183
 
5.8%
Latin
ValueCountFrequency (%)
a 4
28.6%
e 2
14.3%
L 2
14.3%
P 1
 
7.1%
z 1
 
7.1%
E 1
 
7.1%
g 1
 
7.1%
l 1
 
7.1%
k 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3166002
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 626592
19.8%
2 588319
18.6%
3 458636
14.5%
4 260045
8.2%
5 235565
 
7.4%
0 220401
 
7.0%
6 202442
 
6.4%
9 198660
 
6.3%
7 192143
 
6.1%
8 183183
 
5.8%
Other values (10) 16
 
< 0.1%

year
Text

Missing 

Distinct: 208
Distinct (%): < 0.1%
Missing: 684432
Missing (%): 35.5%
Memory size: 14.7 MiB

Length

Max length: 4
Median length: 4
Mean length: 4
Min length: 4

Characters and Unicode

Total characters: 4966516
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 7
Unique (%): < 0.1%

Sample

1st row: 1976
2nd row: 1984
3rd row: 1964
4th row: 1883
5th row: 1909
ValueCountFrequency (%)
1977 73833
 
5.9%
1981 43888
 
3.5%
1976 42215
 
3.4%
1982 38215
 
3.1%
1984 38199
 
3.1%
1908 35404
 
2.9%
1983 34028
 
2.7%
1985 30489
 
2.5%
1964 28252
 
2.3%
1975 25217
 
2.0%
Other values (198) 851889
68.6%

Most occurring characters

ValueCountFrequency (%)
1 1365971
27.5%
9 1248710
25.1%
8 526126
 
10.6%
7 429937
 
8.7%
6 323883
 
6.5%
0 306308
 
6.2%
2 220325
 
4.4%
5 194998
 
3.9%
4 178212
 
3.6%
3 172046
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4966516
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1365971
27.5%
9 1248710
25.1%
8 526126
 
10.6%
7 429937
 
8.7%
6 323883
 
6.5%
0 306308
 
6.2%
2 220325
 
4.4%
5 194998
 
3.9%
4 178212
 
3.6%
3 172046
 
3.5%

Most occurring scripts

ValueCountFrequency (%)
Common 4966516
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1365971
27.5%
9 1248710
25.1%
8 526126
 
10.6%
7 429937
 
8.7%
6 323883
 
6.5%
0 306308
 
6.2%
2 220325
 
4.4%
5 194998
 
3.9%
4 178212
 
3.6%
3 172046
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4966516
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1365971
27.5%
9 1248710
25.1%
8 526126
 
10.6%
7 429937
 
8.7%
6 323883
 
6.5%
0 306308
 
6.2%
2 220325
 
4.4%
5 194998
 
3.9%
4 178212
 
3.6%
3 172046
 
3.5%

month
Text

Missing 

Distinct: 12
Distinct (%): < 0.1%
Missing: 768070
Missing (%): 39.9%
Memory size: 14.7 MiB

Length

Max length: 2
Median length: 1
Mean length: 1.188973835
Min length: 1

Characters and Unicode

Total characters: 1376821
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 5
3rd row: 3
4th row: 8
5th row: 3
ValueCountFrequency (%)
8 133419
11.5%
5 128908
11.1%
7 123523
10.7%
6 108501
9.4%
4 100253
8.7%
11 97478
8.4%
2 97354
8.4%
3 89640
7.7%
9 87608
7.6%
1 69955
6.0%
Other values (2) 121352
10.5%

Most occurring characters

ValueCountFrequency (%)
1 386263
28.1%
2 150267
 
10.9%
8 133419
 
9.7%
5 128908
 
9.4%
7 123523
 
9.0%
6 108501
 
7.9%
4 100253
 
7.3%
3 89640
 
6.5%
9 87608
 
6.4%
0 68439
 
5.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1376821
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 386263
28.1%
2 150267
 
10.9%
8 133419
 
9.7%
5 128908
 
9.4%
7 123523
 
9.0%
6 108501
 
7.9%
4 100253
 
7.3%
3 89640
 
6.5%
9 87608
 
6.4%
0 68439
 
5.0%

Most occurring scripts

ValueCountFrequency (%)
Common 1376821
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 386263
28.1%
2 150267
 
10.9%
8 133419
 
9.7%
5 128908
 
9.4%
7 123523
 
9.0%
6 108501
 
7.9%
4 100253
 
7.3%
3 89640
 
6.5%
9 87608
 
6.4%
0 68439
 
5.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1376821
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 386263
28.1%
2 150267
 
10.9%
8 133419
 
9.7%
5 128908
 
9.4%
7 123523
 
9.0%
6 108501
 
7.9%
4 100253
 
7.3%
3 89640
 
6.5%
9 87608
 
6.4%
0 68439
 
5.0%

day
Text

Missing 

Distinct: 31
Distinct (%): < 0.1%
Missing: 841840
Missing (%): 43.7%
Memory size: 14.7 MiB

Length

Max length: 2
Median length: 2
Mean length: 1.705713134
Min length: 1

Characters and Unicode

Total characters: 1849370
Distinct characters: 10
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 15
3rd row: 15
4th row: 31
5th row: 2
ValueCountFrequency (%)
10 46154
 
4.3%
13 45658
 
4.2%
19 44172
 
4.1%
6 40659
 
3.8%
21 40525
 
3.7%
15 38548
 
3.6%
8 38174
 
3.5%
9 38061
 
3.5%
18 36739
 
3.4%
14 36106
 
3.3%
Other values (21) 679425
62.7%

Most occurring characters

ValueCountFrequency (%)
1 508569
27.5%
2 433588
23.4%
3 161568
 
8.7%
5 109545
 
5.9%
9 109355
 
5.9%
0 109327
 
5.9%
8 109113
 
5.9%
6 105547
 
5.7%
4 102072
 
5.5%
7 100686
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1849370
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 508569
27.5%
2 433588
23.4%
3 161568
 
8.7%
5 109545
 
5.9%
9 109355
 
5.9%
0 109327
 
5.9%
8 109113
 
5.9%
6 105547
 
5.7%
4 102072
 
5.5%
7 100686
 
5.4%

Most occurring scripts

ValueCountFrequency (%)
Common 1849370
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 508569
27.5%
2 433588
23.4%
3 161568
 
8.7%
5 109545
 
5.9%
9 109355
 
5.9%
0 109327
 
5.9%
8 109113
 
5.9%
6 105547
 
5.7%
4 102072
 
5.5%
7 100686
 
5.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1849370
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 508569
27.5%
2 433588
23.4%
3 161568
 
8.7%
5 109545
 
5.9%
9 109355
 
5.9%
0 109327
 
5.9%
8 109113
 
5.9%
6 105547
 
5.7%
4 102072
 
5.5%
7 100686
 
5.4%

verbatimEventDate
Text

Missing 

Distinct: 47773
Distinct (%): 6.3%
Missing: 1172997
Missing (%): 60.9%
Memory size: 14.7 MiB

Length

Max length: 181
Median length: 11
Mean length: 11.01792942
Min length: 1

Characters and Unicode

Total characters: 8297206
Distinct characters: 81
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 1

Unique

Unique: 15836
Unique (%): 2.1%

Sample

1st row: -- --- ----
2nd row: 15 MAY 1984
3rd row: 15 MAR 1964
4th row: 03 MAR 1967
5th row: 31 AUG 1958
ValueCountFrequency (%)
275863
 
12.6%
may 68613
 
3.1%
aug 65838
 
3.0%
jul 61523
 
2.8%
apr 57927
 
2.6%
feb 53279
 
2.4%
jun 52775
 
2.4%
nov 52199
 
2.4%
mar 46116
 
2.1%
1977 42123
 
1.9%
Other values (8402) 1418761
64.6%

Most occurring characters

ValueCountFrequency (%)
1441953
17.4%
1 1077360
13.0%
9 807768
 
9.7%
- 749476
 
9.0%
2 340220
 
4.1%
7 334219
 
4.0%
0 322795
 
3.9%
8 301908
 
3.6%
6 296029
 
3.6%
A 274073
 
3.3%
Other values (71) 2351405
28.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 4020395
48.5%
Uppercase Letter 1821264
22.0%
Space Separator 1441953
 
17.4%
Dash Punctuation 749476
 
9.0%
Lowercase Letter 202093
 
2.4%
Other Punctuation 58036
 
0.7%
Close Punctuation 1858
 
< 0.1%
Open Punctuation 1855
 
< 0.1%
Connector Punctuation 187
 
< 0.1%
Math Symbol 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 23080
11.4%
r 22925
11.3%
l 19197
9.5%
n 18908
9.4%
i 17650
8.7%
a 15940
7.9%
t 14268
7.1%
p 13296
 
6.6%
g 11951
 
5.9%
u 11197
 
5.5%
Other values (15) 33681
16.7%
Uppercase Letter
ValueCountFrequency (%)
A 274073
15.0%
U 175994
 
9.7%
J 155768
 
8.6%
N 143274
 
7.9%
M 120553
 
6.6%
E 116114
 
6.4%
R 101232
 
5.6%
P 93207
 
5.1%
O 88359
 
4.9%
Y 68066
 
3.7%
Other values (14) 484624
26.6%
Other Punctuation
ValueCountFrequency (%)
. 19785
34.1%
/ 15745
27.1%
, 11251
19.4%
: 9405
16.2%
; 983
 
1.7%
? 319
 
0.5%
& 294
 
0.5%
' 244
 
0.4%
" 5
 
< 0.1%
\ 2
 
< 0.1%
Other values (2) 3
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 1077360
26.8%
9 807768
20.1%
2 340220
 
8.5%
7 334219
 
8.3%
0 322795
 
8.0%
8 301908
 
7.5%
6 296029
 
7.4%
3 203527
 
5.1%
5 177538
 
4.4%
4 159031
 
4.0%
Math Symbol
ValueCountFrequency (%)
+ 80
89.9%
~ 8
 
9.0%
< 1
 
1.1%
Close Punctuation
ValueCountFrequency (%)
) 1836
98.8%
] 22
 
1.2%
Open Punctuation
ValueCountFrequency (%)
( 1835
98.9%
[ 20
 
1.1%
Space Separator
ValueCountFrequency (%)
(space) 1441953
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 749476
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 187
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6273849
75.6%
Latin 2023357
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 274073
 
13.5%
U 175994
 
8.7%
J 155768
 
7.7%
N 143274
 
7.1%
M 120553
 
6.0%
E 116114
 
5.7%
R 101232
 
5.0%
P 93207
 
4.6%
O 88359
 
4.4%
Y 68066
 
3.4%
Other values (39) 686717
33.9%
Common
ValueCountFrequency (%)
(space) 1441953
23.0%
1 1077360
17.2%
9 807768
12.9%
- 749476
11.9%
2 340220
 
5.4%
7 334219
 
5.3%
0 322795
 
5.1%
8 301908
 
4.8%
6 296029
 
4.7%
3 203527
 
3.2%
Other values (22) 398594
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8297206
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 1441953
17.4%
1 1077360
13.0%
9 807768
 
9.7%
- 749476
 
9.0%
2 340220
 
4.1%
7 334219
 
4.0%
0 322795
 
3.9%
8 301908
 
3.6%
6 296029
 
3.6%
A 274073
 
3.3%
Other values (71) 2351405
28.3%

habitat
Text

Missing 

Distinct18959
Distinct (%)27.4%
Missing1856817
Missing (%)96.4%
Memory size14.7 MiB

Length

Max length235
Median length159
Mean length19.79862515
Min length1

Characters and Unicode

Total characters1370936
Distinct characters89
Distinct categories10
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique13598
Unique (%)19.6%

Sample

1st rowBeach with fresh water creek running into it
2nd rowFreshwater
3rd rowIn sand
4th rowMangrove
5th rowUnder rocks
ValueCountFrequency (%)
freshwater 9206
 
4.1%
in 6886
 
3.1%
on 6372
 
2.8%
reef 6192
 
2.8%
sand 6091
 
2.7%
coral 5812
 
2.6%
of 4886
 
2.2%
rocks 4638
 
2.1%
sp 4290
 
1.9%
intertidal 4237
 
1.9%
Other values (6964) 165771
73.9%

Most occurring characters

ValueCountFrequency (%)
(space) 155137
 
11.3%
e 134073
 
9.8%
a 117948
 
8.6%
r 101183
 
7.4%
n 83040
 
6.1%
s 82869
 
6.0%
o 79792
 
5.8%
t 71836
 
5.2%
i 60744
 
4.4%
l 60219
 
4.4%
Other values (79) 424095
30.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1121590
81.8%
Space Separator 155137
 
11.3%
Uppercase Letter 60784
 
4.4%
Other Punctuation 20717
 
1.5%
Decimal Number 6940
 
0.5%
Math Symbol 2493
 
0.2%
Dash Punctuation 1845
 
0.1%
Open Punctuation 717
 
0.1%
Close Punctuation 712
 
0.1%
Other Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 134073
12.0%
a 117948
10.5%
r 101183
 
9.0%
n 83040
 
7.4%
s 82869
 
7.4%
o 79792
 
7.1%
t 71836
 
6.4%
i 60744
 
5.4%
l 60219
 
5.4%
d 54652
 
4.9%
Other values (18) 275234
24.5%
Uppercase Letter
ValueCountFrequency (%)
F 12117
19.9%
L 6196
10.2%
S 6180
10.2%
I 5574
9.2%
R 4363
 
7.2%
O 3936
 
6.5%
M 3425
 
5.6%
C 3203
 
5.3%
U 2435
 
4.0%
B 2327
 
3.8%
Other values (16) 11028
18.1%
Other Punctuation
ValueCountFrequency (%)
, 10234
49.4%
. 7692
37.1%
; 838
 
4.0%
/ 686
 
3.3%
' 442
 
2.1%
# 299
 
1.4%
& 196
 
0.9%
: 111
 
0.5%
% 90
 
0.4%
" 69
 
0.3%
Other values (3) 60
 
0.3%
Decimal Number
ValueCountFrequency (%)
1 1216
17.5%
0 1156
16.7%
2 887
12.8%
5 749
10.8%
3 666
9.6%
4 598
8.6%
6 522
7.5%
8 389
 
5.6%
7 387
 
5.6%
9 370
 
5.3%
Math Symbol
ValueCountFrequency (%)
+ 2456
98.5%
= 24
 
1.0%
< 7
 
0.3%
~ 4
 
0.2%
> 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 714
99.6%
[ 3
 
0.4%
Close Punctuation
ValueCountFrequency (%)
) 710
99.7%
] 2
 
0.3%
Space Separator
ValueCountFrequency (%)
(space) 155137
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1845
100.0%
Other Symbol
ValueCountFrequency (%)
° 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1182374
86.2%
Common 188562
 
13.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 134073
 
11.3%
a 117948
 
10.0%
r 101183
 
8.6%
n 83040
 
7.0%
s 82869
 
7.0%
o 79792
 
6.7%
t 71836
 
6.1%
i 60744
 
5.1%
l 60219
 
5.1%
d 54652
 
4.6%
Other values (44) 336018
28.4%
Common
ValueCountFrequency (%)
(space) 155137
82.3%
, 10234
 
5.4%
. 7692
 
4.1%
+ 2456
 
1.3%
- 1845
 
1.0%
1 1216
 
0.6%
0 1156
 
0.6%
2 887
 
0.5%
; 838
 
0.4%
5 749
 
0.4%
Other values (25) 6352
 
3.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1370933
> 99.9%
None 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 155137
 
11.3%
e 134073
 
9.8%
a 117948
 
8.6%
r 101183
 
7.4%
n 83040
 
6.1%
s 82869
 
6.0%
o 79792
 
5.8%
t 71836
 
5.2%
i 60744
 
4.4%
l 60219
 
4.4%
Other values (76) 424092
30.9%
None
ValueCountFrequency (%)
é 1
33.3%
° 1
33.3%
ç 1
33.3%

locationID
Text

Missing 

Distinct94697
Distinct (%)10.1%
Missing983901
Missing (%)51.1%
Memory size14.7 MiB

Length

Max length23319
Median length146
Mean length4.468806784
Min length1

Characters and Unicode

Total characters4210331
Distinct characters94
Distinct categories12
Distinct scripts2
Distinct blocks3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique52902
Unique (%)5.6%

Sample

1st rowE4
2nd rowNR 12-4 ID 101
3rd row23
4th row1002
5th row2059
ValueCountFrequency (%)
not 12390
 
1.2%
rec 12068
 
1.2%
4 8477
 
0.8%
rhb 7694
 
0.7%
rfb 7622
 
0.7%
1 7590
 
0.7%
2 6237
 
0.6%
3 5500
 
0.5%
gs 5167
 
0.5%
6 5009
 
0.5%
Other values (80977) 965185
92.5%

Most occurring characters

ValueCountFrequency (%)
1 474193
 
11.3%
2 393866
 
9.4%
0 331627
 
7.9%
5 295816
 
7.0%
3 287513
 
6.8%
4 264033
 
6.3%
- 262213
 
6.2%
6 216530
 
5.1%
7 190846
 
4.5%
8 180812
 
4.3%
Other values (84) 1312882
31.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2800702
66.5%
Uppercase Letter 880118
 
20.9%
Dash Punctuation 262221
 
6.2%
Space Separator 99236
 
2.4%
Other Punctuation 76024
 
1.8%
Lowercase Letter 68179
 
1.6%
Control 8490
 
0.2%
Connector Punctuation 7888
 
0.2%
Close Punctuation 3371
 
0.1%
Open Punctuation 3364
 
0.1%
Other values (2) 738
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 92080
 
10.5%
S 79022
 
9.0%
C 71641
 
8.1%
B 66978
 
7.6%
R 60079
 
6.8%
M 56872
 
6.5%
N 52082
 
5.9%
E 48007
 
5.5%
I 44751
 
5.1%
T 36825
 
4.2%
Other values (17) 271781
30.9%
Lowercase Letter
ValueCountFrequency (%)
e 9934
14.6%
o 8028
11.8%
r 7686
11.3%
a 7336
10.8%
i 4282
 
6.3%
t 3986
 
5.8%
l 3681
 
5.4%
n 3059
 
4.5%
c 2995
 
4.4%
s 2614
 
3.8%
Other values (17) 14578
21.4%
Other Punctuation
ValueCountFrequency (%)
: 37458
49.3%
. 24829
32.7%
, 7382
 
9.7%
/ 3928
 
5.2%
# 1569
 
2.1%
& 290
 
0.4%
? 151
 
0.2%
; 133
 
0.2%
* 124
 
0.2%
' 119
 
0.2%
Other values (4) 41
 
0.1%
Decimal Number
ValueCountFrequency (%)
1 474193
16.9%
2 393866
14.1%
0 331627
11.8%
5 295816
10.6%
3 287513
10.3%
4 264033
9.4%
6 216530
7.7%
7 190846
6.8%
8 180812
 
6.5%
9 165466
 
5.9%
Close Punctuation
ValueCountFrequency (%)
) 3081
91.4%
] 288
 
8.5%
} 2
 
0.1%
Open Punctuation
ValueCountFrequency (%)
( 3074
91.4%
[ 288
 
8.6%
{ 2
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 262213
> 99.9%
8
 
< 0.1%
Control
ValueCountFrequency (%)
8445
99.5%
45
 
0.5%
Math Symbol
ValueCountFrequency (%)
+ 724
98.4%
= 12
 
1.6%
Other Number
ValueCountFrequency (%)
1
50.0%
1
50.0%
Space Separator
ValueCountFrequency (%)
(space) 99236
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 7888
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3262034
77.5%
Latin 948297
 
22.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 92080
 
9.7%
S 79022
 
8.3%
C 71641
 
7.6%
B 66978
 
7.1%
R 60079
 
6.3%
M 56872
 
6.0%
N 52082
 
5.5%
E 48007
 
5.1%
I 44751
 
4.7%
T 36825
 
3.9%
Other values (44) 339960
35.8%
Common
ValueCountFrequency (%)
1 474193
14.5%
2 393866
12.1%
0 331627
10.2%
5 295816
9.1%
3 287513
8.8%
4 264033
8.1%
- 262213
8.0%
6 216530
6.6%
7 190846
5.9%
8 180812
 
5.5%
Other values (30) 364585
11.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4210319
> 99.9%
Punctuation 8
 
< 0.1%
None 4
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 474193
 
11.3%
2 393866
 
9.4%
0 331627
 
7.9%
5 295816
 
7.0%
3 287513
 
6.8%
4 264033
 
6.3%
- 262213
 
6.2%
6 216530
 
5.1%
7 190846
 
4.5%
8 180812
 
4.3%
Other values (79) 1312870
31.2%
Punctuation
ValueCountFrequency (%)
8
100.0%
None
ValueCountFrequency (%)
ü 1
25.0%
1
25.0%
1
25.0%
É 1
25.0%

higherGeographyID
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing1926060
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length7
Median length7
Mean length7
Min length7

Characters and Unicode

Total characters7
Distinct characters6
Distinct categories2
Distinct scripts1
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row24.1667
ValueCountFrequency (%)
24.1667 1
100.0%

Most occurring characters

ValueCountFrequency (%)
6 2
28.6%
2 1
14.3%
4 1
14.3%
. 1
14.3%
1 1
14.3%
7 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
85.7%
Other Punctuation 1
 
14.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
6 2
33.3%
2 1
16.7%
4 1
16.7%
1 1
16.7%
7 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
6 2
28.6%
2 1
14.3%
4 1
14.3%
. 1
14.3%
1 1
14.3%
7 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6 2
28.6%
2 1
14.3%
4 1
14.3%
. 1
14.3%
1 1
14.3%
7 1
14.3%

higherGeography
Text

Missing 

Distinct12371
Distinct (%)0.7%
Missing67820
Missing (%)3.5%
Memory size14.7 MiB

Length

Max length126
Median length104
Mean length36.17331336
Min length4

Characters and Unicode

Total characters67218734
Distinct characters83
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique3191
Unique (%)0.2%
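The percentages reported for higherGeography can be reproduced from the counts listed above. The sketch below is a worked calculation under an assumption inferred from the figures rather than stated in the report: Missing (%) is taken over all observations, while Distinct (%), Unique (%), and mean length are taken over non-missing values only. The variable names are illustrative.

```python
# Counts copied from the report; names are illustrative.
n_observations = 1_926_061    # "Number of observations" in the overview
n_missing = 67_820            # Missing
n_distinct = 12_371           # Distinct
n_unique = 3_191              # Unique
total_characters = 67_218_734 # Total characters

# Values that are actually present in the column.
n_present = n_observations - n_missing

# Assumed denominators: all rows for missing, present rows for the rest.
missing_pct = 100 * n_missing / n_observations
distinct_pct = 100 * n_distinct / n_present
unique_pct = 100 * n_unique / n_present
mean_length = total_characters / n_present

print(f"Missing (%)  {missing_pct:.1f}")
print(f"Distinct (%) {distinct_pct:.1f}")
print(f"Unique (%)   {unique_pct:.1f}")
print(f"Mean length  {mean_length:.4f}")
```

With these denominators the results round to the 3.5%, 0.7%, and 0.2% shown for this variable, and the mean length comes out at roughly 36.1733, in line with the reported value.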

Sample

1st rowNorth Atlantic Ocean, United States
2nd rowNorth Atlantic Ocean, Gulf of Mexico, United States, Florida
3rd rowNorth Atlantic Ocean, Caribbean Sea, Barbados
4th rowNorth Atlantic Ocean, Gulf of Mexico, United States, Florida
5th rowPhilippines
ValueCountFrequency (%)
ocean 1259680
 
13.4%
north 1097942
 
11.7%
united 886041
 
9.4%
states 871462
 
9.3%
atlantic 718171
 
7.7%
pacific 436930
 
4.7%
mexico 248318
 
2.6%
of 243317
 
2.6%
gulf 228723
 
2.4%
south 203297
 
2.2%
Other values (4653) 3190921
34.0%

Most occurring characters

ValueCountFrequency (%)
(space) 7526561
 
11.2%
a 6864167
 
10.2%
t 6255693
 
9.3%
i 4779365
 
7.1%
e 4733121
 
7.0%
n 4583640
 
6.8%
c 3759705
 
5.6%
o 2896620
 
4.3%
, 2856796
 
4.2%
r 2271670
 
3.4%
Other values (73) 20691396
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 47714802
71.0%
Uppercase Letter 9109346
 
13.6%
Space Separator 7526561
 
11.2%
Other Punctuation 2866960
 
4.3%
Dash Punctuation 1038
 
< 0.1%
Close Punctuation 10
 
< 0.1%
Open Punctuation 10
 
< 0.1%
Decimal Number 6
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 6864167
14.4%
t 6255693
13.1%
i 4779365
10.0%
e 4733121
9.9%
n 4583640
9.6%
c 3759705
7.9%
o 2896620
 
6.1%
r 2271670
 
4.8%
s 2140631
 
4.5%
l 1954823
 
4.1%
Other values (28) 7475367
15.7%
Uppercase Letter
ValueCountFrequency (%)
S 1396876
15.3%
O 1301176
14.3%
N 1192784
13.1%
A 1063675
11.7%
U 893787
9.8%
P 682100
7.5%
C 555397
 
6.1%
M 514428
 
5.6%
G 305926
 
3.4%
F 216131
 
2.4%
Other values (17) 987066
10.8%
Other Punctuation
ValueCountFrequency (%)
, 2856796
99.6%
. 7748
 
0.3%
' 2246
 
0.1%
? 153
 
< 0.1%
& 11
 
< 0.1%
/ 6
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 2
33.3%
8 1
16.7%
3 1
16.7%
2 1
16.7%
0 1
16.7%
Close Punctuation
ValueCountFrequency (%)
) 8
80.0%
] 2
 
20.0%
Open Punctuation
ValueCountFrequency (%)
( 8
80.0%
[ 2
 
20.0%
Space Separator
ValueCountFrequency (%)
(space) 7526561
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1038
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 56824148
84.5%
Common 10394586
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 6864167
12.1%
t 6255693
 
11.0%
i 4779365
 
8.4%
e 4733121
 
8.3%
n 4583640
 
8.1%
c 3759705
 
6.6%
o 2896620
 
5.1%
r 2271670
 
4.0%
s 2140631
 
3.8%
l 1954823
 
3.4%
Other values (55) 16584713
29.2%
Common
ValueCountFrequency (%)
(space) 7526561
72.4%
, 2856796
 
27.5%
. 7748
 
0.1%
' 2246
 
< 0.1%
- 1038
 
< 0.1%
? 153
 
< 0.1%
& 11
 
< 0.1%
) 8
 
< 0.1%
( 8
 
< 0.1%
/ 6
 
< 0.1%
Other values (8) 11
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 67217783
> 99.9%
None 951
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 7526561
 
11.2%
a 6864167
 
10.2%
t 6255693
 
9.3%
i 4779365
 
7.1%
e 4733121
 
7.0%
n 4583640
 
6.8%
c 3759705
 
5.6%
o 2896620
 
4.3%
, 2856796
 
4.3%
r 2271670
 
3.4%
Other values (60) 20690445
30.8%
None
ValueCountFrequency (%)
ç 434
45.6%
í 144
 
15.1%
é 141
 
14.8%
ó 110
 
11.6%
á 100
 
10.5%
ê 7
 
0.7%
è 6
 
0.6%
ô 3
 
0.3%
ü 2
 
0.2%
Ñ 1
 
0.1%
Other values (3) 3
 
0.3%

continent
Text

Missing 

Distinct78
Distinct (%)< 0.1%
Missing585602
Missing (%)30.4%
Memory size14.7 MiB

Length

Max length62
Median length20
Mean length18.7466614
Min length4

Characters and Unicode

Total characters25129131
Distinct characters33
Distinct categories4
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique10
Unique (%)< 0.1%

Sample

1st rowNorth Atlantic Ocean
2nd rowNorth Atlantic Ocean
3rd rowNorth Atlantic Ocean
4th rowNorth Atlantic Ocean
5th rowAntarctic Ocean
ValueCountFrequency (%)
ocean 1259206
32.8%
north 1064954
27.7%
atlantic 718109
18.7%
pacific 436889
 
11.4%
south 160769
 
4.2%
america 74593
 
1.9%
indian 50190
 
1.3%
antarctic 43836
 
1.1%
arctic 10182
 
0.3%
asia 8415
 
0.2%
Other values (16) 13313
 
0.3%

Most occurring characters

ValueCountFrequency (%)
c 3039386
12.1%
t 2763886
11.0%
a 2600398
10.3%
(space) 2499997
9.9%
n 2126760
8.5%
i 1784659
 
7.1%
e 1339358
 
5.3%
O 1259206
 
5.0%
o 1230891
 
4.9%
h 1225723
 
4.9%
Other values (23) 5258867
20.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18781428
74.7%
Uppercase Letter 3838823
 
15.3%
Space Separator 2499997
 
9.9%
Other Punctuation 8883
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 3039386
16.2%
t 2763886
14.7%
a 2600398
13.8%
n 2126760
11.3%
i 1784659
9.5%
e 1339358
7.1%
o 1230891
6.6%
h 1225723
6.5%
r 1205299
 
6.4%
l 721634
 
3.8%
Other values (10) 743434
 
4.0%
Uppercase Letter
ValueCountFrequency (%)
O 1259206
32.8%
N 1064952
27.7%
A 860108
22.4%
P 436889
 
11.4%
S 160772
 
4.2%
I 50190
 
1.3%
C 2777
 
0.1%
E 2768
 
0.1%
U 582
 
< 0.1%
L 579
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
, 8873
99.9%
? 10
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 2499997
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22620251
90.0%
Common 2508880
 
10.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 3039386
13.4%
t 2763886
12.2%
a 2600398
11.5%
n 2126760
9.4%
i 1784659
7.9%
e 1339358
 
5.9%
O 1259206
 
5.6%
o 1230891
 
5.4%
h 1225723
 
5.4%
r 1205299
 
5.3%
Other values (20) 4044685
17.9%
Common
ValueCountFrequency (%)
(space) 2499997
99.6%
, 8873
 
0.4%
? 10
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 25129131
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 3039386
12.1%
t 2763886
11.0%
a 2600398
10.3%
(space) 2499997
9.9%
n 2126760
8.5%
i 1784659
 
7.1%
e 1339358
 
5.3%
O 1259206
 
5.0%
o 1230891
 
4.9%
h 1225723
 
4.9%
Other values (23) 5258867
20.9%

waterBody
Text

Missing 

Distinct1655
Distinct (%)0.1%
Missing666547
Missing (%)34.6%
Memory size14.7 MiB

Length

Max length76
Median length75
Mean length24.49177619
Min length7

Characters and Unicode

Total characters30847735
Distinct characters63
Distinct categories7
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique510
Unique (%)< 0.1%

Sample

1st rowNorth Atlantic Ocean
2nd rowNorth Atlantic Ocean, Gulf of Mexico
3rd rowNorth Atlantic Ocean, Caribbean Sea
4th rowNorth Atlantic Ocean, Gulf of Mexico
5th rowAntarctic Ocean
ValueCountFrequency (%)
ocean 1259206
26.1%
north 998360
20.7%
atlantic 718109
14.9%
pacific 436889
 
9.1%
of 231263
 
4.8%
gulf 228590
 
4.7%
sea 193861
 
4.0%
mexico 187715
 
3.9%
south 160358
 
3.3%
caribbean 89343
 
1.9%
Other values (1319) 317960
 
6.6%

Most occurring characters

ValueCountFrequency (%)
(space) 3562140
11.5%
c 3175333
10.3%
a 3112981
 
10.1%
t 2738433
 
8.9%
n 2331193
 
7.6%
i 2082373
 
6.8%
e 1823359
 
5.9%
o 1648016
 
5.3%
O 1260897
 
4.1%
r 1217914
 
3.9%
Other values (53) 7895096
25.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 22242939
72.1%
Uppercase Letter 4590559
 
14.9%
Space Separator 3562140
 
11.5%
Other Punctuation 451817
 
1.5%
Dash Punctuation 276
 
< 0.1%
Open Punctuation 2
 
< 0.1%
Close Punctuation 2
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 3175333
14.3%
a 3112981
14.0%
t 2738433
12.3%
n 2331193
10.5%
i 2082373
9.4%
e 1823359
8.2%
o 1648016
7.4%
r 1217914
 
5.5%
h 1180259
 
5.3%
l 988338
 
4.4%
Other values (20) 1944740
8.7%
Uppercase Letter
ValueCountFrequency (%)
O 1260897
27.5%
N 1000105
21.8%
A 784132
17.1%
P 450369
 
9.8%
S 386493
 
8.4%
G 231815
 
5.0%
M 210725
 
4.6%
C 120682
 
2.6%
B 53745
 
1.2%
I 51170
 
1.1%
Other values (15) 40426
 
0.9%
Other Punctuation
ValueCountFrequency (%)
, 451223
99.9%
. 464
 
0.1%
' 117
 
< 0.1%
? 13
 
< 0.1%
Space Separator
ValueCountFrequency (%)
(space) 3562140
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 276
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 2
100.0%
Close Punctuation
ValueCountFrequency (%)
] 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 26833498
87.0%
Common 4014237
 
13.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 3175333
11.8%
a 3112981
11.6%
t 2738433
10.2%
n 2331193
 
8.7%
i 2082373
 
7.8%
e 1823359
 
6.8%
o 1648016
 
6.1%
O 1260897
 
4.7%
r 1217914
 
4.5%
h 1180259
 
4.4%
Other values (45) 6262740
23.3%
Common
ValueCountFrequency (%)
(space) 3562140
88.7%
, 451223
 
11.2%
. 464
 
< 0.1%
- 276
 
< 0.1%
' 117
 
< 0.1%
? 13
 
< 0.1%
[ 2
 
< 0.1%
] 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30847632
> 99.9%
None 103
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 3562140
11.5%
c 3175333
10.3%
a 3112981
 
10.1%
t 2738433
 
8.9%
n 2331193
 
7.6%
i 2082373
 
6.8%
e 1823359
 
5.9%
o 1648016
 
5.3%
O 1260897
 
4.1%
r 1217914
 
3.9%
Other values (49) 7894993
25.6%
None
ValueCountFrequency (%)
í 48
46.6%
á 46
44.7%
ó 6
 
5.8%
è 3
 
2.9%

islandGroup
Text

Missing 

Distinct20
Distinct (%)2.6%
Missing1925291
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length17
Median length15
Mean length14.52857143
Min length5

Characters and Unicode

Total characters11187
Distinct characters35
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique6
Unique (%)0.8%

Sample

1st rowSociety Islands
2nd rowSociety Islands
3rd rowSociety Islands
4th rowSociety Islands
5th rowSociety Islands
ValueCountFrequency (%)
islands 707
47.0%
society 679
45.2%
exuma 20
 
1.3%
south 12
 
0.8%
sandwich 12
 
0.8%
florida 10
 
0.7%
keys 10
 
0.7%
pacific 10
 
0.7%
carolina 8
 
0.5%
aleutian 7
 
0.5%
Other values (14) 28
 
1.9%

Most occurring characters

ValueCountFrequency (%)
s 1446
12.9%
a 803
 
7.2%
l 751
 
6.7%
n 748
 
6.7%
i 743
 
6.6%
d 738
 
6.6%
(space) 733
 
6.6%
o 722
 
6.5%
c 713
 
6.4%
e 711
 
6.4%
Other values (25) 3079
27.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8951
80.0%
Uppercase Letter 1503
 
13.4%
Space Separator 733
 
6.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 1446
16.2%
a 803
9.0%
l 751
8.4%
n 748
8.4%
i 743
8.3%
d 738
8.2%
o 722
8.1%
c 713
8.0%
e 711
7.9%
t 699
7.8%
Other values (11) 877
9.8%
Uppercase Letter
ValueCountFrequency (%)
I 710
47.2%
S 703
46.8%
E 21
 
1.4%
C 16
 
1.1%
P 12
 
0.8%
F 10
 
0.7%
K 10
 
0.7%
A 7
 
0.5%
M 6
 
0.4%
R 2
 
0.1%
Other values (3) 6
 
0.4%
Space Separator
ValueCountFrequency (%)
(space) 733
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 10454
93.4%
Common 733
 
6.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 1446
13.8%
a 803
 
7.7%
l 751
 
7.2%
n 748
 
7.2%
i 743
 
7.1%
d 738
 
7.1%
o 722
 
6.9%
c 713
 
6.8%
e 711
 
6.8%
I 710
 
6.8%
Other values (24) 2369
22.7%
Common
ValueCountFrequency (%)
(space) 733
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 11187
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 1446
12.9%
a 803
 
7.2%
l 751
 
6.7%
n 748
 
6.7%
i 743
 
6.6%
d 738
 
6.6%
(space) 733
 
6.6%
o 722
 
6.5%
c 713
 
6.4%
e 711
 
6.4%
Other values (25) 3079
27.5%

island
Text

Missing 

Distinct58
Distinct (%)5.9%
Missing1925083
Missing (%)99.9%
Memory size14.7 MiB

Length

Max length25
Median length6
Mean length6.676891616
Min length4

Characters and Unicode

Total characters6530
Distinct characters49
Distinct categories4
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique33
Unique (%)3.4%

Sample

1st rowMoorea
2nd rowMoorea
3rd rowShikoku
4th rowOahu
5th rowMoorea
ValueCountFrequency (%)
moorea 674
60.4%
oahu 147
 
13.2%
island 91
 
8.2%
great 20
 
1.8%
exuma 20
 
1.8%
nunivak 13
 
1.2%
eniwetok 13
 
1.2%
bonaire 11
 
1.0%
key 10
 
0.9%
west 10
 
0.9%
Other values (58) 106
 
9.5%

Most occurring characters

ValueCountFrequency (%)
o 1430
21.9%
a 1060
16.2%
e 771
11.8%
r 737
11.3%
M 683
10.5%
u 225
 
3.4%
n 186
 
2.8%
h 170
 
2.6%
O 154
 
2.4%
(space) 137
 
2.1%
Other values (39) 977
15.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5279
80.8%
Uppercase Letter 1113
 
17.0%
Space Separator 137
 
2.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1430
27.1%
a 1060
20.1%
e 771
14.6%
r 737
14.0%
u 225
 
4.3%
n 186
 
3.5%
h 170
 
3.2%
s 121
 
2.3%
d 107
 
2.0%
l 105
 
2.0%
Other values (16) 367
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
M 683
61.4%
O 154
 
13.8%
I 90
 
8.1%
E 35
 
3.1%
G 23
 
2.1%
K 21
 
1.9%
N 19
 
1.7%
S 19
 
1.7%
B 17
 
1.5%
R 11
 
1.0%
Other values (11) 41
 
3.7%
Space Separator
ValueCountFrequency (%)
(space) 137
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 6392
97.9%
Common 138
 
2.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 1430
22.4%
a 1060
16.6%
e 771
12.1%
r 737
11.5%
M 683
10.7%
u 225
 
3.5%
n 186
 
2.9%
h 170
 
2.7%
O 154
 
2.4%
s 121
 
1.9%
Other values (37) 855
13.4%
Common
ValueCountFrequency (%)
(space) 137
99.3%
. 1
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6528
> 99.9%
None 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 1430
21.9%
a 1060
16.2%
e 771
11.8%
r 737
11.3%
M 683
10.5%
u 225
 
3.4%
n 186
 
2.8%
h 170
 
2.6%
O 154
 
2.4%
(space) 137
 
2.1%
Other values (38) 975
14.9%
None
ValueCountFrequency (%)
á 2
100.0%

country
Text

Missing 

Distinct353
Distinct (%)< 0.1%
Missing141874
Missing (%)7.4%
Memory size14.7 MiB

Length

Max length44
Median length42
Mean length10.90559173
Min length4

Characters and Unicode

Total characters19457615
Distinct characters60
Distinct categories5
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique44
Unique (%)< 0.1%

Sample

1st rowUnited States
2nd rowUnited States
3rd rowBarbados
4th rowUnited States
5th rowPhilippines
ValueCountFrequency (%)
united 886037
30.7%
states 871459
30.2%
philippines 93768
 
3.2%
mexico 58629
 
2.0%
islands 48870
 
1.7%
panama 46135
 
1.6%
antarctica 40202
 
1.4%
japan 38460
 
1.3%
cuba 30039
 
1.0%
new 28719
 
1.0%
Other values (297) 747880
25.9%

Most occurring characters

ValueCountFrequency (%)
t 2859159
14.7%
e 2264812
11.6%
a 2126586
10.9%
i 1743725
9.0%
n 1525677
7.8%
s 1262431
 
6.5%
d 1113829
 
5.7%
(space) 1106011
 
5.7%
S 918175
 
4.7%
U 889383
 
4.6%
Other values (50) 3647827
18.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15479489
79.6%
Uppercase Letter 2868864
 
14.7%
Space Separator 1106011
 
5.7%
Other Punctuation 3203
 
< 0.1%
Dash Punctuation 48
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 2859159
18.5%
e 2264812
14.6%
a 2126586
13.7%
i 1743725
11.3%
n 1525677
9.9%
s 1262431
8.2%
d 1113829
 
7.2%
l 405876
 
2.6%
r 307232
 
2.0%
o 303838
 
2.0%
Other values (19) 1566324
10.1%
Uppercase Letter
ValueCountFrequency (%)
S 918175
32.0%
U 889383
31.0%
P 203327
 
7.1%
M 115539
 
4.0%
C 114327
 
4.0%
A 102386
 
3.6%
I 86019
 
3.0%
B 74941
 
2.6%
J 65993
 
2.3%
F 45409
 
1.6%
Other values (15) 253365
 
8.8%
Other Punctuation
ValueCountFrequency (%)
. 3089
96.4%
? 94
 
2.9%
, 18
 
0.6%
' 2
 
0.1%
Space Separator
ValueCountFrequency (%)
(space) 1106011
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 48
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 18348353
94.3%
Common 1109262
 
5.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 2859159
15.6%
e 2264812
12.3%
a 2126586
11.6%
i 1743725
9.5%
n 1525677
8.3%
s 1262431
 
6.9%
d 1113829
 
6.1%
S 918175
 
5.0%
U 889383
 
4.8%
l 405876
 
2.2%
Other values (44) 3238700
17.7%
Common
ValueCountFrequency (%)
(space) 1106011
99.7%
. 3089
 
0.3%
? 94
 
< 0.1%
- 48
 
< 0.1%
, 18
 
< 0.1%
' 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19457162
> 99.9%
None 453
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 2859159
14.7%
e 2264812
11.6%
a 2126586
10.9%
i 1743725
9.0%
n 1525677
7.8%
s 1262431
 
6.5%
d 1113829
 
5.7%
(space) 1106011
 
5.7%
S 918175
 
4.7%
U 889383
 
4.6%
Other values (47) 3647374
18.7%
None
ValueCountFrequency (%)
ç 433
95.6%
é 18
 
4.0%
ô 2
 
0.4%

countryCode
Text

Constant  Missing 

Distinct1
Distinct (%)100.0%
Missing1926060
Missing (%)> 99.9%
Memory size14.7 MiB

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters10
Distinct characters6
Distinct categories3
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1
Unique (%)100.0%

Sample

1st row24 10 00 N
ValueCountFrequency (%)
24 1
25.0%
10 1
25.0%
00 1
25.0%
n 1
25.0%

Most occurring characters

ValueCountFrequency (%)
(space) 3
30.0%
0 3
30.0%
2 1
 
10.0%
4 1
 
10.0%
1 1
 
10.0%
N 1
 
10.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
60.0%
Space Separator 3
30.0%
Uppercase Letter 1
 
10.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 3
50.0%
2 1
 
16.7%
4 1
 
16.7%
1 1
 
16.7%
Space Separator
ValueCountFrequency (%)
(space) 3
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 9
90.0%
Latin 1
 
10.0%

Most frequent character per script

Common
ValueCountFrequency (%)
(space) 3
33.3%
0 3
33.3%
2 1
 
11.1%
4 1
 
11.1%
1 1
 
11.1%
Latin
ValueCountFrequency (%)
N 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
(space) 3
30.0%
0 3
30.0%
2 1
 
10.0%
4 1
 
10.0%
1 1
 
10.0%
N 1
 
10.0%

stateProvince
Text

Missing 

Distinct1327
Distinct (%)0.1%
Missing943504
Missing (%)49.0%
Memory size14.7 MiB

Length

Max length51
Median length39
Mean length9.182602129
Min length3

Characters and Unicode

Total characters9022430
Distinct characters73
Distinct categories9
Distinct scripts2
Distinct blocks2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique282
Unique (%)< 0.1%

Sample

1st rowFlorida
2nd rowFlorida
3rd rowMassachusetts
4th rowQuezon
5th rowNewfoundland
ValueCountFrequency (%)
florida 157954
 
13.1%
massachusetts 103360
 
8.6%
california 57075
 
4.7%
carolina 53916
 
4.5%
texas 43585
 
3.6%
alaska 41853
 
3.5%
north 31985
 
2.7%
louisiana 28639
 
2.4%
hawaii 26395
 
2.2%
south 26207
 
2.2%
Other values (1254) 634930
52.7%

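The stateProvince value table lists lowercased single words ("north", "south" and "carolina" are counted separately), which suggests the values are split on whitespace before counting, with each frequency expressed as a share of all tokens. A minimal sketch under that assumption follows; the function name `word_frequencies` is hypothetical and this is not the profiler's own code.

```python
from collections import Counter


def word_frequencies(values, top=10):
    """Return (word, count, percent of all tokens) for the most common words."""
    counts = Counter()
    for value in values:
        if value is None:  # skip missing cells
            continue
        # Lowercase, then split on whitespace: "North Carolina" -> north, carolina
        counts.update(value.lower().split())
    total = sum(counts.values())
    return [(word, n, 100.0 * n / total) for word, n in counts.most_common(top)]


# Hypothetical mini-column for illustration only.
rows = ["Florida", "North Carolina", "florida", None, "South Carolina"]
for word, n, pct in word_frequencies(rows):
    print(f"{word:<12}{n:>4}{pct:>8.1f}%")
```

Under this reading the percentages are taken over tokens rather than rows, so a multi-word value such as "North Carolina" contributes two tokens to the total.
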
Most occurring characters

ValueCountFrequency (%)
a 1427698
15.8%
i 808867
 
9.0%
s 773106
 
8.6%
o 650775
 
7.2%
r 519350
 
5.8%
l 506572
 
5.6%
n 498591
 
5.5%
e 457545
 
5.1%
t 400549
 
4.4%
u 277270
 
3.1%
Other values (63) 2702107
29.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7610434
84.4%
Uppercase Letter 1183054
 
13.1%
Space Separator 223342
 
2.5%
Other Punctuation 5088
 
0.1%
Dash Punctuation 488
 
< 0.1%
Close Punctuation 8
 
< 0.1%
Open Punctuation 8
 
< 0.1%
Decimal Number 7
 
< 0.1%
Math Symbol 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1427698
18.8%
i 808867
10.6%
s 773106
10.2%
o 650775
8.6%
r 519350
 
6.8%
l 506572
 
6.7%
n 498591
 
6.6%
e 457545
 
6.0%
t 400549
 
5.3%
u 277270
 
3.6%
Other values (24) 1290111
17.0%
Uppercase Letter
ValueCountFrequency (%)
M 171112
14.5%
C 165183
14.0%
F 164672
13.9%
A 80857
 
6.8%
N 78765
 
6.7%
T 76122
 
6.4%
S 72411
 
6.1%
I 44680
 
3.8%
G 38390
 
3.2%
L 36079
 
3.0%
Other values (17) 254783
21.5%
Other Punctuation
ValueCountFrequency (%)
, 4592
90.3%
. 302
 
5.9%
' 148
 
2.9%
? 46
 
0.9%
Decimal Number
ValueCountFrequency (%)
1 3
42.9%
0 3
42.9%
7 1
 
14.3%
Space Separator
ValueCountFrequency (%)
223342
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 488
100.0%
Close Punctuation
ValueCountFrequency (%)
) 8
100.0%
Open Punctuation
ValueCountFrequency (%)
( 8
100.0%
Math Symbol
ValueCountFrequency (%)
| 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8793488
97.5%
Common 228942
 
2.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1427698
16.2%
i 808867
 
9.2%
s 773106
 
8.8%
o 650775
 
7.4%
r 519350
 
5.9%
l 506572
 
5.8%
n 498591
 
5.7%
e 457545
 
5.2%
t 400549
 
4.6%
u 277270
 
3.2%
Other values (51) 2473165
28.1%
Common
ValueCountFrequency (%)
223342
97.6%
, 4592
 
2.0%
- 488
 
0.2%
. 302
 
0.1%
' 148
 
0.1%
? 46
 
< 0.1%
) 8
 
< 0.1%
( 8
 
< 0.1%
1 3
 
< 0.1%
0 3
 
< 0.1%
Other values (2) 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9022046
> 99.9%
None 384
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1427698
15.8%
i 808867
 
9.0%
s 773106
 
8.6%
o 650775
 
7.2%
r 519350
 
5.8%
l 506572
 
5.6%
n 498591
 
5.5%
e 457545
 
5.1%
t 400549
 
4.4%
u 277270
 
3.1%
Other values (54) 2701723
29.9%
None
ValueCountFrequency (%)
é 123
32.0%
ó 101
26.3%
í 96
25.0%
á 52
13.5%
ê 7
 
1.8%
è 2
 
0.5%
Ñ 1
 
0.3%
ú 1
 
0.3%
ô 1
 
0.3%

county
Text

Missing 

Distinct: 2594
Distinct (%): 1.9%
Missing: 1786110
Missing (%): 92.7%
Memory size: 14.7 MiB

Length

Max length: 46
Median length: 43
Mean length: 14.35967589
Min length: 3

Characters and Unicode

Total characters: 2009651
Distinct characters: 65
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 558
Unique (%): 0.4%

Sample

1st row: Cumberland County
2nd row: Allamakee County
3rd row: St. Lucie County
4th row: Delaware County
5th row: Kimble County

Value                 Count     Frequency (%)
county                135403    45.4%
st                    3893      1.3%
parish                3202      1.1%
monroe                3116      1.0%
lucie                 2649      0.9%
montgomery            2553      0.9%
san                   2117      0.7%
prince                1875      0.6%
george's              1763      0.6%
jackson               1747      0.6%
Other values (2256)   139854    46.9%

Most occurring characters

ValueCountFrequency (%)
n 223731
11.1%
o 216808
10.8%
t 181020
 
9.0%
u 160899
 
8.0%
158221
 
7.9%
C 152389
 
7.6%
y 151799
 
7.6%
e 105720
 
5.3%
a 103249
 
5.1%
r 74006
 
3.7%
Other values (55) 481809
24.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1546924
77.0%
Uppercase Letter 298370
 
14.8%
Space Separator 158221
 
7.9%
Other Punctuation 5911
 
0.3%
Dash Punctuation 225
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 223731
14.5%
o 216808
14.0%
t 181020
11.7%
u 160899
10.4%
y 151799
9.8%
e 105720
6.8%
a 103249
6.7%
r 74006
 
4.8%
i 55521
 
3.6%
l 50143
 
3.2%
Other values (22) 224028
14.5%
Uppercase Letter
ValueCountFrequency (%)
C 152389
51.1%
M 16354
 
5.5%
S 14112
 
4.7%
L 13052
 
4.4%
P 12733
 
4.3%
B 11991
 
4.0%
G 8959
 
3.0%
W 8632
 
2.9%
A 8278
 
2.8%
D 7831
 
2.6%
Other values (16) 44039
 
14.8%
Other Punctuation
ValueCountFrequency (%)
. 3891
65.8%
' 1979
33.5%
, 24
 
0.4%
& 11
 
0.2%
/ 6
 
0.1%
Space Separator
ValueCountFrequency (%)
158221
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 225
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1845294
91.8%
Common 164357
 
8.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 223731
12.1%
o 216808
11.7%
t 181020
9.8%
u 160899
 
8.7%
C 152389
 
8.3%
y 151799
 
8.2%
e 105720
 
5.7%
a 103249
 
5.6%
r 74006
 
4.0%
i 55521
 
3.0%
Other values (48) 420152
22.8%
Common
ValueCountFrequency (%)
158221
96.3%
. 3891
 
2.4%
' 1979
 
1.2%
- 225
 
0.1%
, 24
 
< 0.1%
& 11
 
< 0.1%
/ 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2009642
> 99.9%
None 9
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 223731
11.1%
o 216808
10.8%
t 181020
 
9.0%
u 160899
 
8.0%
158221
 
7.9%
C 152389
 
7.6%
y 151799
 
7.6%
e 105720
 
5.3%
a 103249
 
5.1%
r 74006
 
3.7%
Other values (49) 481800
24.0%
None
ValueCountFrequency (%)
ó 3
33.3%
ü 2
22.2%
ñ 1
 
11.1%
ç 1
 
11.1%
ø 1
 
11.1%
è 1
 
11.1%

locality
Text

Missing 

Distinct: 204716
Distinct (%): 15.9%
Missing: 642266
Missing (%): 33.3%
Memory size: 14.7 MiB

Length

Max length: 13524
Median length: 378
Mean length: 28.98951702
Min length: 1

Characters and Unicode

Total characters: 37216597
Distinct characters: 139
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 4
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 126299
Unique (%): 9.8%

Sample

1st row: off Delaware
2nd row: W Coast
3rd row: Cape Sable, West Of
4th row: Antarctic Peninsula
5th row: Georges Bank

Value                  Count      Frequency (%)
island                 342298     5.6%
of                     336380     5.5%
off                    252624     4.1%
bay                    137509     2.2%
islands                98135      1.6%
bank                   84580      1.4%
south                  74622      1.2%
georges                66648      1.1%
florida                63420      1.0%
river                  63361      1.0%
Other values (77108)   4634261    75.3%

Most occurring characters

ValueCountFrequency (%)
4868751
 
13.1%
a 3497602
 
9.4%
e 2450745
 
6.6%
o 2296346
 
6.2%
n 2154517
 
5.8%
r 1674260
 
4.5%
s 1628596
 
4.4%
i 1597480
 
4.3%
l 1584073
 
4.3%
t 1475574
 
4.0%
Other values (129) 13988653
37.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25258393
67.9%
Uppercase Letter 5372235
 
14.4%
Space Separator 4868751
 
13.1%
Other Punctuation 1209015
 
3.2%
Decimal Number 423706
 
1.1%
Dash Punctuation 41185
 
0.1%
Open Punctuation 15167
 
< 0.1%
Close Punctuation 15038
 
< 0.1%
Control 7284
 
< 0.1%
Math Symbol 5035
 
< 0.1%
Other values (7) 788
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3497602
13.8%
e 2450745
9.7%
o 2296346
 
9.1%
n 2154517
 
8.5%
r 1674260
 
6.6%
s 1628596
 
6.4%
i 1597480
 
6.3%
l 1584073
 
6.3%
t 1475574
 
5.8%
d 1017847
 
4.0%
Other values (49) 5881353
23.3%
Uppercase Letter
ValueCountFrequency (%)
S 539946
 
10.1%
I 501583
 
9.3%
B 475968
 
8.9%
C 466565
 
8.7%
O 360035
 
6.7%
P 312753
 
5.8%
M 279542
 
5.2%
R 262521
 
4.9%
L 254760
 
4.7%
A 250943
 
4.7%
Other values (19) 1667619
31.0%
Other Punctuation
ValueCountFrequency (%)
, 986732
81.6%
. 147275
 
12.2%
' 31551
 
2.6%
; 24393
 
2.0%
/ 8160
 
0.7%
# 2752
 
0.2%
& 2520
 
0.2%
: 2297
 
0.2%
" 2103
 
0.2%
? 1193
 
0.1%
Other values (6) 39
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 82668
19.5%
0 70820
16.7%
2 57121
13.5%
5 49646
11.7%
3 39680
9.4%
4 31892
 
7.5%
6 30555
 
7.2%
7 22055
 
5.2%
8 20481
 
4.8%
9 18788
 
4.4%
Math Symbol
ValueCountFrequency (%)
+ 4129
82.0%
> 403
 
8.0%
= 375
 
7.4%
~ 121
 
2.4%
< 3
 
0.1%
| 2
 
< 0.1%
± 2
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 14360
94.7%
[ 789
 
5.2%
{ 18
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 14252
94.8%
] 776
 
5.2%
} 10
 
0.1%
Dash Punctuation
ValueCountFrequency (%)
- 41184
> 99.9%
1
 
< 0.1%
Control
ValueCountFrequency (%)
7245
99.5%
39
 
0.5%
Space Separator
ValueCountFrequency (%)
4868751
100.0%
Other Symbol
ValueCountFrequency (%)
° 762
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 14
100.0%
Modifier Symbol
ValueCountFrequency (%)
^ 6
100.0%
Modifier Letter
ValueCountFrequency (%)
ʻ 3
100.0%
Initial Punctuation
ValueCountFrequency (%)
1
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30630628
82.3%
Common 6585969
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3497602
 
11.4%
e 2450745
 
8.0%
o 2296346
 
7.5%
n 2154517
 
7.0%
r 1674260
 
5.5%
s 1628596
 
5.3%
i 1597480
 
5.2%
l 1584073
 
5.2%
t 1475574
 
4.8%
d 1017847
 
3.3%
Other values (78) 11253588
36.7%
Common
ValueCountFrequency (%)
4868751
73.9%
, 986732
 
15.0%
. 147275
 
2.2%
1 82668
 
1.3%
0 70820
 
1.1%
2 57121
 
0.9%
5 49646
 
0.8%
- 41184
 
0.6%
3 39680
 
0.6%
4 31892
 
0.5%
Other values (41) 210200
 
3.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 37214633
> 99.9%
None 1958
 
< 0.1%
Modifier Letters 3
 
< 0.1%
Punctuation 3
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4868751
 
13.1%
a 3497602
 
9.4%
e 2450745
 
6.6%
o 2296346
 
6.2%
n 2154517
 
5.8%
r 1674260
 
4.5%
s 1628596
 
4.4%
i 1597480
 
4.3%
l 1584073
 
4.3%
t 1475574
 
4.0%
Other values (86) 13986689
37.6%
None
ValueCountFrequency (%)
° 762
38.9%
é 230
 
11.7%
ã 187
 
9.6%
á 141
 
7.2%
ó 138
 
7.0%
í 109
 
5.6%
ñ 78
 
4.0%
ú 55
 
2.8%
ç 36
 
1.8%
ī 36
 
1.8%
Other values (29) 186
 
9.5%
Modifier Letters
ValueCountFrequency (%)
ʻ 3
100.0%
Punctuation
ValueCountFrequency (%)
1
33.3%
1
33.3%
1
33.3%
Distinct: 1038
Distinct (%): 15.3%
Missing: 1919257
Missing (%): 99.6%
Memory size: 14.7 MiB

Length

Max length: 7
Median length: 6
Mean length: 5.359347443
Min length: 3

Characters and Unicode

Total characters: 36465
Distinct characters: 11
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 394
Unique (%): 5.8%

Sample

1st row: 783.0
2nd row: 15.0
3rd row: 135.0
4th row: 4070.0
5th row: 870.0

Value                 Count   Frequency (%)
1981.0                618     9.1%
135.0                 196     2.9%
350.0                 165     2.4%
348.0                 125     1.8%
164.0                 123     1.8%
149.0                 117     1.7%
309.0                 116     1.7%
388.0                 85      1.2%
988.0                 82      1.2%
1100.0                72      1.1%
Other values (1028)   5105    75.0%

Most occurring characters

ValueCountFrequency (%)
0 9229
25.3%
. 6804
18.7%
1 4981
13.7%
2 2401
 
6.6%
8 2371
 
6.5%
3 2286
 
6.3%
9 2000
 
5.5%
5 1844
 
5.1%
4 1684
 
4.6%
7 1453
 
4.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 29661
81.3%
Other Punctuation 6804
 
18.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 9229
31.1%
1 4981
16.8%
2 2401
 
8.1%
8 2371
 
8.0%
3 2286
 
7.7%
9 2000
 
6.7%
5 1844
 
6.2%
4 1684
 
5.7%
7 1453
 
4.9%
6 1412
 
4.8%
Other Punctuation
ValueCountFrequency (%)
. 6804
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 36465
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 9229
25.3%
. 6804
18.7%
1 4981
13.7%
2 2401
 
6.6%
8 2371
 
6.5%
3 2286
 
6.3%
9 2000
 
5.5%
5 1844
 
5.1%
4 1684
 
4.6%
7 1453
 
4.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 36465
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 9229
25.3%
. 6804
18.7%
1 4981
13.7%
2 2401
 
6.6%
8 2371
 
6.5%
3 2286
 
6.3%
9 2000
 
5.5%
5 1844
 
5.1%
4 1684
 
4.6%
7 1453
 
4.0%
Distinct: 725
Distinct (%): 20.6%
Missing: 1922544
Missing (%): 99.8%
Memory size: 14.7 MiB

Length

Max length: 7
Median length: 6
Mean length: 5.355416548
Min length: 3

Characters and Unicode

Total characters: 18835
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 261
Unique (%): 7.4%

Sample

1st row: 783.0
2nd row: 15.0
3rd row: 185.0
4th row: 870.0
5th row: 853.0

Value                Count   Frequency (%)
185.0                198     5.6%
914.0                57      1.6%
1524.0               48      1.4%
1100.0               45      1.3%
610.0                40      1.1%
1219.0               37      1.1%
1829.0               34      1.0%
2.0                  33      0.9%
1372.0               33      0.9%
65.0                 32      0.9%
Other values (715)   2960    84.2%

Most occurring characters

ValueCountFrequency (%)
0 5068
26.9%
. 3516
18.7%
1 2463
13.1%
2 1466
 
7.8%
5 1254
 
6.7%
3 983
 
5.2%
8 942
 
5.0%
6 851
 
4.5%
4 837
 
4.4%
7 736
 
3.9%
Other values (2) 719
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 15318
81.3%
Other Punctuation 3516
 
18.7%
Dash Punctuation 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 5068
33.1%
1 2463
16.1%
2 1466
 
9.6%
5 1254
 
8.2%
3 983
 
6.4%
8 942
 
6.1%
6 851
 
5.6%
4 837
 
5.5%
7 736
 
4.8%
9 718
 
4.7%
Other Punctuation
ValueCountFrequency (%)
. 3516
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 18835
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 5068
26.9%
. 3516
18.7%
1 2463
13.1%
2 1466
 
7.8%
5 1254
 
6.7%
3 983
 
5.2%
8 942
 
5.0%
6 851
 
4.5%
4 837
 
4.4%
7 736
 
3.9%
Other values (2) 719
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18835
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 5068
26.9%
. 3516
18.7%
1 2463
13.1%
2 1466
 
7.8%
5 1254
 
6.7%
3 983
 
5.2%
8 942
 
5.0%
6 851
 
4.5%
4 837
 
4.4%
7 736
 
3.9%
Other values (2) 719
 
3.8%

verbatimElevation
Text

Missing 

Distinct: 126
Distinct (%): 27.3%
Missing: 1925599
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 44
Median length: 4
Mean length: 10.17099567
Min length: 4

Characters and Unicode

Total characters: 4699
Distinct characters: 51
Distinct categories: 9
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 65
Unique (%): 14.1%

Sample

1st row: 7000
2nd row: 4070 m.a.s.l.
3rd row: 4200-4400
4th row: 2009 +/- 20.1 feet
5th row: 3000

Value                Count   Frequency (%)
collected            53      5.6%
on                   53      5.6%
and                  51      5.4%
flat                 50      5.3%
lagoon               50      5.3%
slope                50      5.3%
m                    27      2.8%
3800                 23      2.4%
2550                 21      2.2%
above                19      2.0%
Other values (148)   554     58.3%

Most occurring characters

ValueCountFrequency (%)
0 660
14.0%
489
 
10.4%
l 346
 
7.4%
e 330
 
7.0%
o 320
 
6.8%
a 237
 
5.0%
3 219
 
4.7%
5 218
 
4.6%
t 202
 
4.3%
n 193
 
4.1%
Other values (41) 1485
31.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2418
51.5%
Decimal Number 1576
33.5%
Space Separator 489
 
10.4%
Other Punctuation 72
 
1.5%
Uppercase Letter 70
 
1.5%
Dash Punctuation 34
 
0.7%
Open Punctuation 17
 
0.4%
Close Punctuation 17
 
0.4%
Math Symbol 6
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 346
14.3%
e 330
13.6%
o 320
13.2%
a 237
9.8%
t 202
8.4%
n 193
8.0%
s 124
 
5.1%
d 118
 
4.9%
f 84
 
3.5%
m 80
 
3.3%
Other values (13) 384
15.9%
Decimal Number
ValueCountFrequency (%)
0 660
41.9%
3 219
 
13.9%
5 218
 
13.8%
2 127
 
8.1%
4 98
 
6.2%
8 74
 
4.7%
1 68
 
4.3%
7 48
 
3.0%
9 43
 
2.7%
6 21
 
1.3%
Other Punctuation
ValueCountFrequency (%)
. 40
55.6%
' 20
27.8%
? 6
 
8.3%
, 3
 
4.2%
/ 2
 
2.8%
; 1
 
1.4%
Uppercase Letter
ValueCountFrequency (%)
C 51
72.9%
E 15
 
21.4%
A 2
 
2.9%
I 1
 
1.4%
T 1
 
1.4%
Math Symbol
ValueCountFrequency (%)
~ 3
50.0%
+ 2
33.3%
> 1
 
16.7%
Space Separator
ValueCountFrequency (%)
489
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 34
100.0%
Open Punctuation
ValueCountFrequency (%)
( 17
100.0%
Close Punctuation
ValueCountFrequency (%)
) 17
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2488
52.9%
Common 2211
47.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 346
13.9%
e 330
13.3%
o 320
12.9%
a 237
9.5%
t 202
8.1%
n 193
7.8%
s 124
 
5.0%
d 118
 
4.7%
f 84
 
3.4%
m 80
 
3.2%
Other values (18) 454
18.2%
Common
ValueCountFrequency (%)
0 660
29.9%
489
22.1%
3 219
 
9.9%
5 218
 
9.9%
2 127
 
5.7%
4 98
 
4.4%
8 74
 
3.3%
1 68
 
3.1%
7 48
 
2.2%
9 43
 
1.9%
Other values (13) 167
 
7.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 4699
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 660
14.0%
489
 
10.4%
l 346
 
7.4%
e 330
 
7.0%
o 320
 
6.8%
a 237
 
5.0%
3 219
 
4.7%
5 218
 
4.6%
t 202
 
4.3%
n 193
 
4.1%
Other values (41) 1485
31.6%

verticalDatum
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926060
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 3
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 152

Value   Count   Frequency (%)
152     1       100.0%

Most occurring characters

ValueCountFrequency (%)
1 1
33.3%
5 1
33.3%
2 1
33.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 3
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
33.3%
5 1
33.3%
2 1
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 3
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 1
33.3%
5 1
33.3%
2 1
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 1
33.3%
5 1
33.3%
2 1
33.3%

minimumDepthInMeters
Text

Missing 

Distinct: 6902
Distinct (%): 0.9%
Missing: 1143588
Missing (%): 59.4%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 7
Mean length: 4.378034769
Min length: 3

Characters and Unicode

Total characters: 3425694
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2028
Unique (%): 0.3%

Sample

1st row: 77.0
2nd row: 50.0
3rd row: 74.0
4th row: 265.0
5th row: 75.0

Value                 Count     Frequency (%)
0.0                   45226     5.8%
1.0                   16089     2.1%
18.0                  10809     1.4%
2.0                   9889      1.3%
15.0                  9294      1.2%
84.0                  9270      1.2%
82.0                  8938      1.1%
3.0                   8672      1.1%
27.0                  8646      1.1%
55.0                  8479      1.1%
Other values (6887)   647161    82.7%

Most occurring characters

ValueCountFrequency (%)
0 981792
28.7%
. 782472
22.8%
1 321150
 
9.4%
2 239243
 
7.0%
5 194428
 
5.7%
3 185474
 
5.4%
4 175086
 
5.1%
8 152567
 
4.5%
6 145116
 
4.2%
7 129322
 
3.8%
Other values (2) 119044
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2643180
77.2%
Other Punctuation 782472
 
22.8%
Dash Punctuation 42
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 981792
37.1%
1 321150
 
12.2%
2 239243
 
9.1%
5 194428
 
7.4%
3 185474
 
7.0%
4 175086
 
6.6%
8 152567
 
5.8%
6 145116
 
5.5%
7 129322
 
4.9%
9 119002
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 782472
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3425694
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 981792
28.7%
. 782472
22.8%
1 321150
 
9.4%
2 239243
 
7.0%
5 194428
 
5.7%
3 185474
 
5.4%
4 175086
 
5.1%
8 152567
 
4.5%
6 145116
 
4.2%
7 129322
 
3.8%
Other values (2) 119044
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3425694
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 981792
28.7%
. 782472
22.8%
1 321150
 
9.4%
2 239243
 
7.0%
5 194428
 
5.7%
3 185474
 
5.4%
4 175086
 
5.1%
8 152567
 
4.5%
6 145116
 
4.2%
7 129322
 
3.8%
Other values (2) 119044
 
3.5%

maximumDepthInMeters
Text

Missing 

Distinct: 6653
Distinct (%): 0.9%
Missing: 1205034
Missing (%): 62.6%
Memory size: 14.7 MiB

Length

Max length: 8
Median length: 7
Mean length: 4.453148079
Min length: 3

Characters and Unicode

Total characters: 3210840
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1921
Unique (%): 0.3%

Sample

1st row: 77.0
2nd row: 400.0
3rd row: 74.0
4th row: 265.0
5th row: 75.0

Value                 Count     Frequency (%)
1.0                   27351     3.8%
2.0                   10561     1.5%
18.0                  9723      1.3%
84.0                  9131      1.3%
3.0                   8394      1.2%
55.0                  7481      1.0%
27.0                  7196      1.0%
37.0                  6993      1.0%
5.0                   6769      0.9%
0.0                   6747      0.9%
Other values (6638)   620681    86.1%

Most occurring characters

ValueCountFrequency (%)
0 885387
27.6%
. 721026
22.5%
1 323911
 
10.1%
2 234786
 
7.3%
5 184685
 
5.8%
3 176891
 
5.5%
4 166718
 
5.2%
8 143338
 
4.5%
6 138529
 
4.3%
7 123043
 
3.8%
Other values (2) 112526
 
3.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2489772
77.5%
Other Punctuation 721026
 
22.5%
Dash Punctuation 42
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 885387
35.6%
1 323911
 
13.0%
2 234786
 
9.4%
5 184685
 
7.4%
3 176891
 
7.1%
4 166718
 
6.7%
8 143338
 
5.8%
6 138529
 
5.6%
7 123043
 
4.9%
9 112484
 
4.5%
Other Punctuation
ValueCountFrequency (%)
. 721026
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 42
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 3210840
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 885387
27.6%
. 721026
22.5%
1 323911
 
10.1%
2 234786
 
7.3%
5 184685
 
5.8%
3 176891
 
5.5%
4 166718
 
5.2%
8 143338
 
4.5%
6 138529
 
4.3%
7 123043
 
3.8%
Other values (2) 112526
 
3.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 3210840
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 885387
27.6%
. 721026
22.5%
1 323911
 
10.1%
2 234786
 
7.3%
5 184685
 
5.8%
3 176891
 
5.5%
4 166718
 
5.2%
8 143338
 
4.5%
6 138529
 
4.3%
7 123043
 
3.8%
Other values (2) 112526
 
3.5%

verbatimDepth
Text

Missing 

Distinct: 1531
Distinct (%): 5.8%
Missing: 1899821
Missing (%): 98.6%
Memory size: 14.7 MiB

Length

Max length: 99
Median length: 91
Mean length: 13.4351753
Min length: 1

Characters and Unicode

Total characters: 352539
Distinct characters: 79
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 722
Unique (%): 2.8%

Sample

1st row: Surface
2nd row: max depth 1772 ft
3rd row: surface
4th row: Intertidal
5th row: Intertidal

Value                 Count    Frequency (%)
intertidal            11930    23.4%
surface               4084     8.0%
recorded              2869     5.6%
depths                2848     5.6%
multiple              2844     5.6%
shore                 1165     2.3%
0-300                 1120     2.2%
0                     1067     2.1%
depth                 1023     2.0%
low                   964      1.9%
Other values (1043)   21001    41.2%

Most occurring characters

ValueCountFrequency (%)
t 36679
 
10.4%
e 35131
 
10.0%
r 25384
 
7.2%
24675
 
7.0%
d 24169
 
6.9%
l 20645
 
5.9%
a 20478
 
5.8%
i 19388
 
5.5%
0 16025
 
4.5%
n 14725
 
4.2%
Other values (69) 115240
32.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 250835
71.2%
Decimal Number 39286
 
11.1%
Space Separator 24675
 
7.0%
Uppercase Letter 19955
 
5.7%
Other Punctuation 12435
 
3.5%
Dash Punctuation 4880
 
1.4%
Math Symbol 236
 
0.1%
Open Punctuation 118
 
< 0.1%
Close Punctuation 118
 
< 0.1%
Final Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 36679
14.6%
e 35131
14.0%
r 25384
10.1%
d 24169
9.6%
l 20645
8.2%
a 20478
8.2%
i 19388
7.7%
n 14725
 
5.9%
c 8173
 
3.3%
p 7641
 
3.0%
Other values (15) 38422
15.3%
Uppercase Letter
ValueCountFrequency (%)
I 10832
54.3%
S 4488
22.5%
M 2984
 
15.0%
L 758
 
3.8%
T 218
 
1.1%
B 109
 
0.5%
H 83
 
0.4%
D 78
 
0.4%
C 73
 
0.4%
Z 59
 
0.3%
Other values (14) 273
 
1.4%
Other Punctuation
ValueCountFrequency (%)
, 5996
48.2%
: 3686
29.6%
. 1398
 
11.2%
" 841
 
6.8%
; 207
 
1.7%
' 201
 
1.6%
@ 43
 
0.3%
/ 29
 
0.2%
& 22
 
0.2%
? 10
 
0.1%
Decimal Number
ValueCountFrequency (%)
0 16025
40.8%
1 4889
 
12.4%
2 3728
 
9.5%
3 3378
 
8.6%
5 2938
 
7.5%
8 2555
 
6.5%
4 1746
 
4.4%
6 1719
 
4.4%
7 1431
 
3.6%
9 877
 
2.2%
Math Symbol
ValueCountFrequency (%)
< 138
58.5%
= 60
25.4%
+ 24
 
10.2%
~ 14
 
5.9%
Space Separator
ValueCountFrequency (%)
24675
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4880
100.0%
Open Punctuation
ValueCountFrequency (%)
( 118
100.0%
Close Punctuation
ValueCountFrequency (%)
) 118
100.0%
Final Punctuation
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 270790
76.8%
Common 81749
 
23.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 36679
13.5%
e 35131
13.0%
r 25384
9.4%
d 24169
8.9%
l 20645
 
7.6%
a 20478
 
7.6%
i 19388
 
7.2%
n 14725
 
5.4%
I 10832
 
4.0%
c 8173
 
3.0%
Other values (39) 55186
20.4%
Common
ValueCountFrequency (%)
24675
30.2%
0 16025
19.6%
, 5996
 
7.3%
1 4889
 
6.0%
- 4880
 
6.0%
2 3728
 
4.6%
: 3686
 
4.5%
3 3378
 
4.1%
5 2938
 
3.6%
8 2555
 
3.1%
Other values (20) 8999
 
11.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 352538
> 99.9%
Punctuation 1
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 36679
 
10.4%
e 35131
 
10.0%
r 25384
 
7.2%
24675
 
7.0%
d 24169
 
6.9%
l 20645
 
5.9%
a 20478
 
5.8%
i 19388
 
5.5%
0 16025
 
4.5%
n 14725
 
4.2%
Other values (68) 115239
32.7%
Punctuation
ValueCountFrequency (%)
1
100.0%

decimalLatitude
Text

Missing 

Distinct: 70077
Distinct (%): 7.0%
Missing: 927243
Missing (%): 48.1%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 7
Mean length: 6.23583776
Min length: 3

Characters and Unicode

Total characters: 6228467
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 26221
Unique (%): 2.6%

Sample

1st row: 38.7117
2nd row: 25.2819
3rd row: -62.667
4th row: 42.0833
5th row: 13.7792

Value                  Count     Frequency (%)
25.58                  10487     1.0%
40.6583                8820      0.9%
26.17                  7319      0.7%
26.5                   5191      0.5%
26.97                  3956      0.4%
25.7883                3456      0.3%
9.4                    3108      0.3%
9.37                   2976      0.3%
40.895                 2589      0.3%
40.66                  2520      0.3%
Other values (65547)   948396    95.0%

Most occurring characters

ValueCountFrequency (%)
. 998818
16.0%
3 788063
12.7%
2 616041
9.9%
5 525246
8.4%
7 524886
8.4%
4 501449
8.1%
1 480642
7.7%
6 474819
7.6%
8 472165
7.6%
9 377071
 
6.1%
Other values (3) 469267
7.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5077658
81.5%
Other Punctuation 998818
 
16.0%
Dash Punctuation 151990
 
2.4%
Uppercase Letter 1
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
3 788063
15.5%
2 616041
12.1%
5 525246
10.3%
7 524886
10.3%
4 501449
9.9%
1 480642
9.5%
6 474819
9.4%
8 472165
9.3%
9 377071
7.4%
0 317276
6.2%
Other Punctuation
ValueCountFrequency (%)
. 998818
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 151990
100.0%
Uppercase Letter
ValueCountFrequency (%)
E 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 6228466
> 99.9%
Latin 1
 
< 0.1%

Most frequent character per script

Common
ValueCountFrequency (%)
. 998818
16.0%
3 788063
12.7%
2 616041
9.9%
5 525246
8.4%
7 524886
8.4%
4 501449
8.1%
1 480642
7.7%
6 474819
7.6%
8 472165
7.6%
9 377071
 
6.1%
Other values (2) 469266
7.5%
Latin
ValueCountFrequency (%)
E 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6228467
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 998818
16.0%
3 788063
12.7%
2 616041
9.9%
5 525246
8.4%
7 524886
8.4%
4 501449
8.1%
1 480642
7.7%
6 474819
7.6%
8 472165
7.6%
9 377071
 
6.1%
Other values (3) 469267
7.5%

decimalLongitude
Text

Missing 

Distinct: 74667
Distinct (%): 7.5%
Missing: 927246
Missing (%): 48.1%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 8
Mean length: 7.110719202
Min length: 3

Characters and Unicode

Total characters: 7102293
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 27319
Unique (%): 2.7%

Sample

1st row: -73.405
2nd row: -83.6297
3rd row: -54.742
4th row: -66.7708
5th row: 121.586

Value                  Count     Frequency (%)
80.1                   10527     1.1%
127.848                4531      0.5%
67.7683                4212      0.4%
80.13                  3737      0.4%
82.7                   3517      0.4%
67.77                  2820      0.3%
66.775                 2591      0.3%
81.6633                2462      0.2%
70.6731                2397      0.2%
67.755                 2355      0.2%
Other values (69814)   959666    96.1%

Most occurring characters

ValueCountFrequency (%)
. 998815
14.1%
- 826053
11.6%
7 744517
10.5%
8 682555
9.6%
1 674556
9.5%
6 575271
8.1%
3 562177
7.9%
2 472510
6.7%
5 433016
6.1%
9 409778
5.8%
Other values (2) 723045
10.2%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5277425
74.3%
Other Punctuation 998815
 
14.1%
Dash Punctuation 826053
 
11.6%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 744517
14.1%
8 682555
12.9%
1 674556
12.8%
6 575271
10.9%
3 562177
10.7%
2 472510
9.0%
5 433016
8.2%
9 409778
7.8%
0 371185
7.0%
4 351860
6.7%
Other Punctuation
ValueCountFrequency (%)
. 998815
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 826053
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 7102293
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
. 998815
14.1%
- 826053
11.6%
7 744517
10.5%
8 682555
9.6%
1 674556
9.5%
6 575271
8.1%
3 562177
7.9%
2 472510
6.7%
5 433016
6.1%
9 409778
5.8%
Other values (2) 723045
10.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7102293
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
. 998815
14.1%
- 826053
11.6%
7 744517
10.5%
8 682555
9.6%
1 674556
9.5%
6 575271
8.1%
3 562177
7.9%
2 472510
6.7%
5 433016
6.1%
9 409778
5.8%
Other values (2) 723045
10.2%

geodeticDatum
Text

Missing 

Distinct: 5
Distinct (%): < 0.1%
Missing: 1858158
Missing (%): 96.5%
Memory size: 14.7 MiB

Length

Max length: 18
Median length: 5
Mean length: 5.17221625
Min length: 5

Characters and Unicode

Total characters: 351209
Distinct characters: 21
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): < 0.1%

Sample

1st row: WGS84
2nd row: WGS84
3rd row: WGS84
4th row: WGS84
5th row: WGS84

Value        Count    Frequency (%)
wgs84        67002    96.1%
wgs          896      1.3%
84           896      1.3%
epsg:4326    896      1.3%
nad83        3        < 0.1%
epsg:4269    3        < 0.1%
1936-08-14   1        < 0.1%
1926-08-24   1        < 0.1%

Most occurring characters

ValueCountFrequency (%)
4 68799
19.6%
G 68797
19.6%
S 68797
19.6%
8 67903
19.3%
W 67898
19.3%
1795
 
0.5%
6 901
 
0.3%
2 901
 
0.3%
3 900
 
0.3%
: 899
 
0.3%
Other values (11) 3619
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 207299
59.0%
Decimal Number 139414
39.7%
Space Separator 1795
 
0.5%
Other Punctuation 899
 
0.3%
Open Punctuation 899
 
0.3%
Close Punctuation 899
 
0.3%
Dash Punctuation 4
 
< 0.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 68799
49.3%
8 67903
48.7%
6 901
 
0.6%
2 901
 
0.6%
3 900
 
0.6%
9 5
 
< 0.1%
1 3
 
< 0.1%
0 2
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
G 68797
33.2%
S 68797
33.2%
W 67898
32.8%
P 899
 
0.4%
E 899
 
0.4%
N 3
 
< 0.1%
A 3
 
< 0.1%
D 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1795
100.0%
Other Punctuation
ValueCountFrequency (%)
: 899
100.0%
Open Punctuation
ValueCountFrequency (%)
( 899
100.0%
Close Punctuation
ValueCountFrequency (%)
) 899
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 207299
59.0%
Common 143910
41.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 68799
47.8%
8 67903
47.2%
1795
 
1.2%
6 901
 
0.6%
2 901
 
0.6%
3 900
 
0.6%
: 899
 
0.6%
( 899
 
0.6%
) 899
 
0.6%
9 5
 
< 0.1%
Other values (3) 9
 
< 0.1%
Latin
ValueCountFrequency (%)
G 68797
33.2%
S 68797
33.2%
W 67898
32.8%
P 899
 
0.4%
E 899
 
0.4%
N 3
 
< 0.1%
A 3
 
< 0.1%
D 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 351209
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 68799
19.6%
G 68797
19.6%
S 68797
19.6%
8 67903
19.3%
W 67898
19.3%
1795
 
0.5%
6 901
 
0.3%
2 901
 
0.3%
3 900
 
0.3%
: 899
 
0.3%
Other values (11) 3619
 
1.0%
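
The token counts above suggest that geodeticDatum holds a few spellings of the same datum (a bare name such as WGS84, and a longer form that carries an EPSG code) plus two date-like values that look misplaced. A minimal sketch of how such values might be normalized before analysis follows. The function name, the lookup table, and the example strings "WGS 84 (EPSG:4326)" and "NAD83 (EPSG:4269)" are assumptions inferred from the token counts, not values confirmed by the report.

```python
import re

# Hypothetical lookup from bare datum names to EPSG codes.
_DATUM_TO_EPSG = {"WGS84": "EPSG:4326", "NAD83": "EPSG:4269"}


def normalize_datum(raw):
    """Return an EPSG code for a raw geodeticDatum string, or None if unrecognized."""
    if raw is None:
        return None
    # An explicit EPSG code inside the string takes priority.
    match = re.search(r"EPSG:\s*(\d+)", raw, flags=re.IGNORECASE)
    if match:
        return f"EPSG:{match.group(1)}"
    # Otherwise compare the name with whitespace removed and upper-cased.
    key = re.sub(r"\s+", "", raw).upper()
    return _DATUM_TO_EPSG.get(key)
```

Values that return None (such as the two date-like strings) would need manual review rather than silent coercion.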

coordinatePrecision
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 6
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 227
2nd row 236
Value Count Frequency (%)
227 1 50.0%
236 1 50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%
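
The character and category counts in this section can be reproduced with Python's standard `unicodedata` module. The sketch below is only an illustration of the idea; the report's own implementation may differ, and the helper name `category_counts` is hypothetical. It covers general categories only, since the standard library exposes no script or block lookup.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count characters and Unicode general categories across a list of strings."""
    chars = Counter()
    categories = Counter()
    for value in values:
        for ch in value:
            chars[ch] += 1
            # unicodedata.category returns a two-letter code, e.g. "Nd" for Decimal Number.
            categories[unicodedata.category(ch)] += 1
    return chars, categories


# The two sample rows shown above for coordinatePrecision.
chars, categories = category_counts(["227", "236"])
```

For the two sample rows this gives six characters in total, all in category "Nd" (Decimal Number), with "2" occurring three times, which agrees with the tables above.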

pointRadiusSpatialFit
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 6
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 227
2nd row 236
Value Count Frequency (%)
227 1 50.0%
236 1 50.0%

Most occurring characters

ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring scripts

ValueCountFrequency (%)
Common 6
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 3
50.0%
7 1
 
16.7%
3 1
 
16.7%
6 1
 
16.7%
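
A Distinct (%) of 100.0% next to a Missing (%) above 99.9% can look contradictory. The figures in this section are consistent with the distinct percentage being taken over non-missing values only (2 distinct out of 2 present) and the missing percentage over all observations. The sketch below illustrates that reading; it is an assumption about how the numbers relate, not the report's actual code, and `summarize` is a hypothetical helper.

```python
def summarize(values):
    """Return distinct/missing counts and percentages; None marks a missing value."""
    total = len(values)
    present = [v for v in values if v is not None]
    distinct = len(set(present))
    missing = total - len(present)
    return {
        "distinct": distinct,
        # Distinct share is relative to non-missing values only.
        "distinct_pct": 100.0 * distinct / len(present) if present else 0.0,
        "missing": missing,
        # Missing share is relative to all observations.
        "missing_pct": 100.0 * missing / total if total else 0.0,
    }


# Tiny stand-in for a sparse column: two values present, two missing.
stats = summarize(["227", "236", None, None])
```

The same arithmetic applied to geodeticDatum (1858158 missing out of 1926061 observations) gives roughly 96.5%, matching the value reported for that variable.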

verbatimCoordinates
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 4
Median length 4
Mean length 4
Min length 4

Characters and Unicode

Total characters 8
Distinct characters 5
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 1936
2nd row 1926
Value Count Frequency (%)
1936 1 50.0%
1926 1 50.0%

Most occurring characters

ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 8
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 2
25.0%
9 2
25.0%
6 2
25.0%
3 1
12.5%
2 1
12.5%

verbatimLatitude
Text

Missing 

Distinct 13526
Distinct (%) 18.9%
Missing 1854408
Missing (%) 96.3%
Memory size 14.7 MiB

Length

Max length 81176
Median length 33373
Mean length 13.09566941
Min length 1

Characters and Unicode

Total characters 938344
Distinct characters 104
Distinct categories 16
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6600
Unique (%) 9.2%

Sample

1st row 12.083197
2nd row 35 00.11 N
3rd row 21.502905
4th row 29 47.5 N
5th row 36.4512
Value Count Frequency (%)
n 42003 19.9%
29 8560 4.0%
s 7189 3.4%
28 6242 3.0%
27 6074 2.9%
00 3976 1.9%
26 2567 1.2%
24 2186 1.0%
23 1948 0.9%
42 1918 0.9%
Other values (13451) 128767 60.9%

Most occurring characters

ValueCountFrequency (%)
128576
 
13.7%
2 75402
 
8.0%
62317
 
6.6%
0 55150
 
5.9%
3 51265
 
5.5%
N 50953
 
5.4%
4 50156
 
5.3%
. 47966
 
5.1%
1 45741
 
4.9%
5 44030
 
4.7%
Other values (94) 326788
34.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 467432
49.8%
Space Separator 128576
 
13.7%
Lowercase Letter 118832
 
12.7%
Uppercase Letter 78978
 
8.4%
Other Punctuation 65558
 
7.0%
Control 62650
 
6.7%
Dash Punctuation 9429
 
1.0%
Other Symbol 5148
 
0.5%
Other Letter 477
 
0.1%
Close Punctuation 372
 
< 0.1%
Other values (6) 892
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14870
12.5%
e 11139
9.4%
i 10564
 
8.9%
t 10203
 
8.6%
o 8641
 
7.3%
n 8286
 
7.0%
d 7287
 
6.1%
c 7250
 
6.1%
l 6889
 
5.8%
r 6805
 
5.7%
Other values (20) 26898
22.6%
Uppercase Letter
ValueCountFrequency (%)
N 50953
64.5%
S 10837
 
13.7%
M 2290
 
2.9%
A 2082
 
2.6%
L 1490
 
1.9%
U 1211
 
1.5%
P 1190
 
1.5%
O 973
 
1.2%
D 972
 
1.2%
E 948
 
1.2%
Other values (16) 6032
 
7.6%
Other Punctuation
ValueCountFrequency (%)
. 47966
73.2%
' 4497
 
6.9%
: 3498
 
5.3%
; 2874
 
4.4%
, 2694
 
4.1%
/ 2078
 
3.2%
" 1092
 
1.7%
398
 
0.6%
* 149
 
0.2%
? 125
 
0.2%
Other values (5) 187
 
0.3%
Decimal Number
ValueCountFrequency (%)
2 75402
16.1%
0 55150
11.8%
3 51265
11.0%
4 50156
10.7%
1 45741
9.8%
5 44030
9.4%
9 39569
8.5%
7 37449
8.0%
8 36950
7.9%
6 31720
6.8%
Close Punctuation
ValueCountFrequency (%)
) 361
97.0%
} 10
 
2.7%
] 1
 
0.3%
Open Punctuation
ValueCountFrequency (%)
( 361
97.0%
{ 10
 
2.7%
[ 1
 
0.3%
Math Symbol
ValueCountFrequency (%)
= 57
80.3%
~ 13
 
18.3%
+ 1
 
1.4%
Control
ValueCountFrequency (%)
62317
99.5%
333
 
0.5%
Dash Punctuation
ValueCountFrequency (%)
- 9426
> 99.9%
3
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 5146
> 99.9%
2
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 82
85.4%
˚ 14
 
14.6%
Modifier Letter
ValueCountFrequency (%)
ʹ 26
96.3%
ʺ 1
 
3.7%
Space Separator
ValueCountFrequency (%)
128576
100.0%
Other Letter
ValueCountFrequency (%)
º 477
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 227
100.0%
Final Punctuation
ValueCountFrequency (%)
99
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 740057
78.9%
Latin 198287
 
21.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
N 50953
25.7%
a 14870
 
7.5%
e 11139
 
5.6%
S 10837
 
5.5%
i 10564
 
5.3%
t 10203
 
5.1%
o 8641
 
4.4%
n 8286
 
4.2%
d 7287
 
3.7%
c 7250
 
3.7%
Other values (47) 58257
29.4%
Common
ValueCountFrequency (%)
128576
17.4%
2 75402
10.2%
62317
8.4%
0 55150
 
7.5%
3 51265
 
6.9%
4 50156
 
6.8%
. 47966
 
6.5%
1 45741
 
6.2%
5 44030
 
5.9%
9 39569
 
5.3%
Other values (37) 139885
18.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 931984
99.3%
None 5715
 
0.6%
Punctuation 602
 
0.1%
Modifier Letters 41
 
< 0.1%
Geometric Shapes 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
128576
13.8%
2 75402
 
8.1%
62317
 
6.7%
0 55150
 
5.9%
3 51265
 
5.5%
N 50953
 
5.5%
4 50156
 
5.4%
. 47966
 
5.1%
1 45741
 
4.9%
5 44030
 
4.7%
Other values (79) 320428
34.4%
None
ValueCountFrequency (%)
° 5146
90.0%
º 477
 
8.3%
´ 82
 
1.4%
ü 4
 
0.1%
è 3
 
0.1%
é 2
 
< 0.1%
ö 1
 
< 0.1%
Punctuation
ValueCountFrequency (%)
398
66.1%
102
 
16.9%
99
 
16.4%
3
 
0.5%
Modifier Letters
ValueCountFrequency (%)
ʹ 26
63.4%
˚ 14
34.1%
ʺ 1
 
2.4%
Geometric Shapes
ValueCountFrequency (%)
2
100.0%
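
The sample rows for verbatimLatitude mix plain decimal degrees ("12.083197") with space-separated degrees and decimal minutes plus a hemisphere letter ("35 00.11 N"). The sketch below shows one way such strings might be converted to signed decimal degrees. The function name and the regular expression are assumptions made for illustration. It deliberately handles only the space-separated forms seen in the samples and returns None for anything else, including values that use degree or minute symbols.

```python
import re

# Degrees, optional minutes, optional seconds, optional hemisphere letter.
_COORD = re.compile(
    r"^\s*(?P<sign>-)?(?P<deg>\d+(?:\.\d+)?)"
    r"(?:\s+(?P<min>\d+(?:\.\d+)?))?"
    r"(?:\s+(?P<sec>\d+(?:\.\d+)?))?"
    r"\s*(?P<hemi>[NSEWnsew])?\s*$"
)


def to_decimal_degrees(text):
    """Convert a verbatim coordinate string to signed decimal degrees, or None."""
    match = _COORD.match(text)
    if match is None:
        return None
    value = float(match.group("deg"))
    if match.group("min"):
        value += float(match.group("min")) / 60.0
    if match.group("sec"):
        value += float(match.group("sec")) / 3600.0
    # South and west hemispheres, or a leading minus sign, give negative values.
    hemisphere = (match.group("hemi") or "").upper()
    negative = match.group("sign") is not None or hemisphere in ("S", "W")
    return -value if negative else value
```

Strings that return None are better kept for manual inspection than dropped, given how many distinct characters and categories this variable contains.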

verbatimLongitude
Text

Missing 

Distinct 13853
Distinct (%) 19.4%
Missing 1854475
Missing (%) 96.3%
Memory size 14.7 MiB

Length

Max length 59
Median length 40
Mean length 10.07856285
Min length 2

Characters and Unicode

Total characters 721484
Distinct characters 62
Distinct categories 14
Distinct scripts 2
Distinct blocks 5
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 6900
Unique (%) 9.6%

Sample

1st row -68.899058
2nd row 139 13.45 E
3rd row -157.801784
4th row 85 54.5 W
5th row -121.1546
Value Count Frequency (%)
w 42754 22.5%
84 7563 4.0%
e 6274 3.3%
00 3967 2.1%
83 3204 1.7%
86 2758 1.5%
85 1732 0.9%
53 1629 0.9%
79 1576 0.8%
17 1286 0.7%
Other values (8986) 117328 61.7%

Most occurring characters

ValueCountFrequency (%)
118485
16.4%
0 65196
9.0%
1 63938
8.9%
8 51373
 
7.1%
W 49328
 
6.8%
. 48187
 
6.7%
5 47280
 
6.6%
2 42177
 
5.8%
3 42152
 
5.8%
4 41303
 
5.7%
Other values (52) 152065
21.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 456784
63.3%
Space Separator 118485
 
16.4%
Uppercase Letter 59224
 
8.2%
Other Punctuation 56936
 
7.9%
Dash Punctuation 15722
 
2.2%
Lowercase Letter 8267
 
1.1%
Other Symbol 5138
 
0.7%
Other Letter 470
 
0.1%
Connector Punctuation 220
 
< 0.1%
Final Punctuation 107
 
< 0.1%
Other values (4) 131
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1217
14.7%
o 1165
14.1%
e 1093
13.2%
d 946
11.4%
g 936
11.3%
t 908
11.0%
i 901
10.9%
u 897
10.9%
a 69
 
0.8%
r 47
 
0.6%
Other values (7) 88
 
1.1%
Decimal Number
ValueCountFrequency (%)
0 65196
14.3%
1 63938
14.0%
8 51373
11.2%
5 47280
10.4%
2 42177
9.2%
3 42152
9.2%
4 41303
9.0%
7 37164
8.1%
6 34447
7.5%
9 31754
7.0%
Uppercase Letter
ValueCountFrequency (%)
W 49328
83.3%
E 7724
 
13.0%
L 1158
 
2.0%
D 440
 
0.7%
S 222
 
0.4%
G 220
 
0.4%
N 75
 
0.1%
M 54
 
0.1%
A 2
 
< 0.1%
R 1
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
. 48187
84.6%
' 4398
 
7.7%
; 2607
 
4.6%
" 984
 
1.7%
398
 
0.7%
* 149
 
0.3%
102
 
0.2%
? 72
 
0.1%
, 25
 
< 0.1%
/ 14
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 15719
> 99.9%
3
 
< 0.1%
Other Symbol
ValueCountFrequency (%)
° 5136
> 99.9%
2
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 82
85.4%
˚ 14
 
14.6%
Modifier Letter
ValueCountFrequency (%)
ʹ 25
96.2%
ʺ 1
 
3.8%
Math Symbol
ValueCountFrequency (%)
~ 5
71.4%
= 2
 
28.6%
Space Separator
ValueCountFrequency (%)
118485
100.0%
Other Letter
ValueCountFrequency (%)
º 470
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 220
100.0%
Final Punctuation
ValueCountFrequency (%)
107
100.0%
Close Punctuation
ValueCountFrequency (%)
) 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 653523
90.6%
Latin 67961
 
9.4%

Most frequent character per script

Common
ValueCountFrequency (%)
118485
18.1%
0 65196
10.0%
1 63938
9.8%
8 51373
7.9%
. 48187
7.4%
5 47280
 
7.2%
2 42177
 
6.5%
3 42152
 
6.4%
4 41303
 
6.3%
7 37164
 
5.7%
Other values (24) 96268
14.7%
Latin
ValueCountFrequency (%)
W 49328
72.6%
E 7724
 
11.4%
n 1217
 
1.8%
o 1165
 
1.7%
L 1158
 
1.7%
e 1093
 
1.6%
d 946
 
1.4%
g 936
 
1.4%
t 908
 
1.3%
i 901
 
1.3%
Other values (18) 2585
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 715144
99.1%
None 5688
 
0.8%
Punctuation 610
 
0.1%
Modifier Letters 40
 
< 0.1%
Geometric Shapes 2
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
118485
16.6%
0 65196
9.1%
1 63938
8.9%
8 51373
 
7.2%
W 49328
 
6.9%
. 48187
 
6.7%
5 47280
 
6.6%
2 42177
 
5.9%
3 42152
 
5.9%
4 41303
 
5.8%
Other values (41) 145725
20.4%
None
ValueCountFrequency (%)
° 5136
90.3%
º 470
 
8.3%
´ 82
 
1.4%
Punctuation
ValueCountFrequency (%)
398
65.2%
107
 
17.5%
102
 
16.7%
3
 
0.5%
Modifier Letters
ValueCountFrequency (%)
ʹ 25
62.5%
˚ 14
35.0%
ʺ 1
 
2.5%
Geometric Shapes
ValueCountFrequency (%)
2
100.0%

Distinct 9
Distinct (%) < 0.1%
Missing 1246668
Missing (%) 64.7%
Memory size 14.7 MiB

Length

Max length 23
Median length 23
Mean length 22.60570097
Min length 3

Characters and Unicode

Total characters 15358155
Distinct characters 30
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) < 0.1%

Sample

1st row Degrees Minutes Seconds
2nd row Degrees Minutes Seconds
3rd row Degrees Minutes Seconds
4th row Degrees Minutes Seconds
5th row Degrees Minutes Seconds
Value Count Frequency (%)
degrees 670787 33.4%
minutes 648088 32.3%
seconds 648088 32.3%
decimal 22699 1.1%
township 7003 0.3%
range 7003 0.3%
marsden 604 < 0.1%
square 604 < 0.1%
unknown 532 < 0.1%
utm 464 < 0.1%
Other values (3) 6 < 0.1%

Most occurring characters

ValueCountFrequency (%)
e 3339448
21.7%
s 1974570
12.9%
1326485
 
8.6%
n 1312382
 
8.5%
g 677790
 
4.4%
i 677790
 
4.4%
r 671998
 
4.4%
d 671349
 
4.4%
D 670832
 
4.4%
c 670788
 
4.4%
Other values (20) 3364723
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12047514
78.4%
Uppercase Letter 1984156
 
12.9%
Space Separator 1326485
 
8.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3339448
27.7%
s 1974570
16.4%
n 1312382
 
10.9%
g 677790
 
5.6%
i 677790
 
5.6%
r 671998
 
5.6%
d 671349
 
5.6%
c 670788
 
5.6%
o 655625
 
5.4%
u 648695
 
5.4%
Other values (9) 747079
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
D 670832
33.8%
M 649156
32.7%
S 648692
32.7%
T 7467
 
0.4%
R 7003
 
0.4%
U 998
 
0.1%
Q 3
 
< 0.1%
A 2
 
< 0.1%
F 2
 
< 0.1%
G 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1326485
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14031670
91.4%
Common 1326485
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3339448
23.8%
s 1974570
14.1%
n 1312382
 
9.4%
g 677790
 
4.8%
i 677790
 
4.8%
r 671998
 
4.8%
d 671349
 
4.8%
D 670832
 
4.8%
c 670788
 
4.8%
o 655625
 
4.7%
Other values (19) 2709098
19.3%
Common
ValueCountFrequency (%)
1326485
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15358155
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3339448
21.7%
s 1974570
12.9%
1326485
 
8.6%
n 1312382
 
8.5%
g 677790
 
4.4%
i 677790
 
4.4%
r 671998
 
4.4%
d 671349
 
4.4%
D 670832
 
4.4%
c 670788
 
4.4%
Other values (20) 3364723
21.9%

footprintSRS
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 13
Median length 10
Mean length 10
Min length 7

Characters and Unicode

Total characters 20
Distinct characters 14
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Algeria
2nd row United States
Value Count Frequency (%)
algeria 1 33.3%
united 1 33.3%
states 1 33.3%

Most occurring characters

ValueCountFrequency (%)
e 3
15.0%
t 3
15.0%
i 2
10.0%
a 2
10.0%
A 1
 
5.0%
l 1
 
5.0%
g 1
 
5.0%
r 1
 
5.0%
U 1
 
5.0%
n 1
 
5.0%
Other values (4) 4
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
80.0%
Uppercase Letter 3
 
15.0%
Space Separator 1
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
18.8%
t 3
18.8%
i 2
12.5%
a 2
12.5%
l 1
 
6.2%
g 1
 
6.2%
r 1
 
6.2%
n 1
 
6.2%
d 1
 
6.2%
s 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
A 1
33.3%
U 1
33.3%
S 1
33.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
95.0%
Common 1
 
5.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
15.8%
t 3
15.8%
i 2
10.5%
a 2
10.5%
A 1
 
5.3%
l 1
 
5.3%
g 1
 
5.3%
r 1
 
5.3%
U 1
 
5.3%
n 1
 
5.3%
Other values (3) 3
15.8%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 3
15.0%
t 3
15.0%
i 2
10.0%
a 2
10.0%
A 1
 
5.0%
l 1
 
5.0%
g 1
 
5.0%
r 1
 
5.0%
U 1
 
5.0%
n 1
 
5.0%
Other values (4) 4
20.0%

georeferencedBy
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926060
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 5
Median length 5
Mean length 5
Min length 5

Characters and Unicode

Total characters 5
Distinct characters 5
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Idaho
Value Count Frequency (%)
idaho 1 100.0%

Most occurring characters

ValueCountFrequency (%)
I 1
20.0%
d 1
20.0%
a 1
20.0%
h 1
20.0%
o 1
20.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4
80.0%
Uppercase Letter 1
 
20.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
d 1
25.0%
a 1
25.0%
h 1
25.0%
o 1
25.0%
Uppercase Letter
ValueCountFrequency (%)
I 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
I 1
20.0%
d 1
20.0%
a 1
20.0%
h 1
20.0%
o 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 5
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
I 1
20.0%
d 1
20.0%
a 1
20.0%
h 1
20.0%
o 1
20.0%

georeferenceProtocol
Text

Missing 

Distinct 113
Distinct (%) < 0.1%
Missing 1265567
Missing (%) 65.7%
Memory size 14.7 MiB

Length

Max length 87
Median length 20
Mean length 20.10035065
Min length 3

Characters and Unicode

Total characters 13276161
Distinct characters 64
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 20
Unique (%) < 0.1%

Sample

1st row unknown, from legacy
2nd row unknown, from legacy
3rd row unknown, from legacy
4th row unknown, from legacy
5th row unknown, from legacy
Value Count Frequency (%)
from 508975 26.2%
unknown 507502 26.1%
legacy 505051 26.0%
geolocate 70300 3.6%
names 41929 2.2%
geographic 41548 2.1%
of 35272 1.8%
getty 34680 1.8%
thesaurus 34679 1.8%
may 23185 1.2%
Other values (125) 141489 7.3%

Most occurring characters

ValueCountFrequency (%)
n 1560576
 
11.8%
1284116
 
9.7%
o 1253199
 
9.4%
e 821923
 
6.2%
a 796893
 
6.0%
r 641911
 
4.8%
c 624543
 
4.7%
g 591212
 
4.5%
u 580658
 
4.4%
y 577336
 
4.3%
Other values (54) 4543794
34.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 10759532
81.0%
Space Separator 1284116
 
9.7%
Uppercase Letter 560035
 
4.2%
Other Punctuation 551518
 
4.2%
Decimal Number 114440
 
0.9%
Dash Punctuation 3268
 
< 0.1%
Close Punctuation 1624
 
< 0.1%
Open Punctuation 1624
 
< 0.1%
Connector Punctuation 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n 1560576
14.5%
o 1253199
11.6%
e 821923
 
7.6%
a 796893
 
7.4%
r 641911
 
6.0%
c 624543
 
5.8%
g 591212
 
5.5%
u 580658
 
5.4%
y 577336
 
5.4%
m 572844
 
5.3%
Other values (14) 2738437
25.5%
Uppercase Letter
ValueCountFrequency (%)
G 185792
33.2%
L 76688
13.7%
E 75159
13.4%
O 56827
 
10.1%
N 43892
 
7.8%
T 36730
 
6.6%
M 26360
 
4.7%
S 23936
 
4.3%
U 8296
 
1.5%
I 8275
 
1.5%
Other values (9) 18080
 
3.2%
Decimal Number
ValueCountFrequency (%)
0 52662
46.0%
2 49519
43.3%
9 5897
 
5.2%
4 2925
 
2.6%
1 1974
 
1.7%
5 1442
 
1.3%
8 15
 
< 0.1%
7 4
 
< 0.1%
3 2
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
, 528902
95.9%
/ 9410
 
1.7%
. 9109
 
1.7%
: 3482
 
0.6%
& 594
 
0.1%
! 18
 
< 0.1%
' 3
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1284116
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 3268
100.0%
Close Punctuation
ValueCountFrequency (%)
) 1624
100.0%
Open Punctuation
ValueCountFrequency (%)
( 1624
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 11319567
85.3%
Common 1956594
 
14.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
n 1560576
13.8%
o 1253199
 
11.1%
e 821923
 
7.3%
a 796893
 
7.0%
r 641911
 
5.7%
c 624543
 
5.5%
g 591212
 
5.2%
u 580658
 
5.1%
y 577336
 
5.1%
m 572844
 
5.1%
Other values (33) 3298472
29.1%
Common
ValueCountFrequency (%)
1284116
65.6%
, 528902
27.0%
0 52662
 
2.7%
2 49519
 
2.5%
/ 9410
 
0.5%
. 9109
 
0.5%
9 5897
 
0.3%
: 3482
 
0.2%
- 3268
 
0.2%
4 2925
 
0.1%
Other values (11) 7304
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13276161
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n 1560576
 
11.8%
1284116
 
9.7%
o 1253199
 
9.4%
e 821923
 
6.2%
a 796893
 
6.0%
r 641911
 
4.8%
c 624543
 
4.7%
g 591212
 
4.5%
u 580658
 
4.4%
y 577336
 
4.3%
Other values (54) 4543794
34.2%

georeferenceSources
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 1926058
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 26
Median length 13
Mean length 15.66666667
Min length 8

Characters and Unicode

Total characters 47
Distinct characters 22
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row Boutaara
2nd row Beveridge, I.
3rd row Denton, J. F.; Byrd, E. E.
Value Count Frequency (%)
e 2 22.2%
boutaara 1 11.1%
beveridge 1 11.1%
i 1 11.1%
denton 1 11.1%
j 1 11.1%
f 1 11.1%
byrd 1 11.1%

Most occurring characters

ValueCountFrequency (%)
6
12.8%
. 5
 
10.6%
e 4
 
8.5%
B 3
 
6.4%
r 3
 
6.4%
, 3
 
6.4%
a 3
 
6.4%
d 2
 
4.3%
o 2
 
4.3%
t 2
 
4.3%
Other values (12) 14
29.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23
48.9%
Other Punctuation 9
 
19.1%
Uppercase Letter 9
 
19.1%
Space Separator 6
 
12.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 4
17.4%
r 3
13.0%
a 3
13.0%
d 2
8.7%
o 2
8.7%
t 2
8.7%
n 2
8.7%
v 1
 
4.3%
i 1
 
4.3%
g 1
 
4.3%
Other values (2) 2
8.7%
Uppercase Letter
ValueCountFrequency (%)
B 3
33.3%
E 2
22.2%
I 1
 
11.1%
D 1
 
11.1%
J 1
 
11.1%
F 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
. 5
55.6%
, 3
33.3%
; 1
 
11.1%
Space Separator
ValueCountFrequency (%)
6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 32
68.1%
Common 15
31.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 4
12.5%
B 3
 
9.4%
r 3
 
9.4%
a 3
 
9.4%
d 2
 
6.2%
o 2
 
6.2%
t 2
 
6.2%
n 2
 
6.2%
E 2
 
6.2%
v 1
 
3.1%
Other values (8) 8
25.0%
Common
ValueCountFrequency (%)
6
40.0%
. 5
33.3%
, 3
20.0%
; 1
 
6.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 47
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
6
12.8%
. 5
 
10.6%
e 4
 
8.5%
B 3
 
6.4%
r 3
 
6.4%
, 3
 
6.4%
a 3
 
6.4%
d 2
 
4.3%
o 2
 
4.3%
t 2
 
4.3%
Other values (12) 14
29.8%

georeferenceRemarks
Text

Missing 

Distinct 4818
Distinct (%) 15.9%
Missing 1895791
Missing (%) 98.4%
Memory size 14.7 MiB

Length

Max length 122
Median length 118
Mean length 23.035778
Min length 1

Characters and Unicode

Total characters 697293
Distinct characters 78
Distinct categories 9
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3162
Unique (%) 10.4%

Sample

1st row Extended About 16 Km Offshore From Crystal River Power Plant
2nd row 0.8 mile west of Montgomery-Polk county line, north side of
3rd row San Andreas Fault
4th row 6 Mile W Of Watsonville
5th row from Holt data card
Value Count Frequency (%)
approximate 9788 9.0%
from 6466 5.9%
river 3462 3.2%
of 3094 2.8%
about 3075 2.8%
16 2973 2.7%
km 2969 2.7%
plant 2932 2.7%
offshore 2928 2.7%
power 2928 2.7%
Other values (4966) 68697 62.8%

Most occurring characters

ValueCountFrequency (%)
79042
 
11.3%
a 60472
 
8.7%
e 55627
 
8.0%
o 49156
 
7.0%
r 47472
 
6.8%
t 40214
 
5.8%
i 29459
 
4.2%
n 26669
 
3.8%
p 24667
 
3.5%
m 24216
 
3.5%
Other values (68) 260299
37.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 519508
74.5%
Space Separator 79042
 
11.3%
Uppercase Letter 71753
 
10.3%
Decimal Number 14975
 
2.1%
Other Punctuation 10482
 
1.5%
Close Punctuation 574
 
0.1%
Open Punctuation 570
 
0.1%
Dash Punctuation 354
 
0.1%
Math Symbol 35
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 60472
11.6%
e 55627
10.7%
o 49156
 
9.5%
r 47472
 
9.1%
t 40214
 
7.7%
i 29459
 
5.7%
n 26669
 
5.1%
p 24667
 
4.7%
m 24216
 
4.7%
l 23875
 
4.6%
Other values (16) 137681
26.5%
Uppercase Letter
ValueCountFrequency (%)
P 8825
12.3%
R 7367
10.3%
C 6870
 
9.6%
O 6475
 
9.0%
B 4745
 
6.6%
A 4362
 
6.1%
F 4158
 
5.8%
E 3946
 
5.5%
S 3869
 
5.4%
K 3631
 
5.1%
Other values (16) 17505
24.4%
Decimal Number
ValueCountFrequency (%)
1 4138
27.6%
6 3402
22.7%
5 1659
11.1%
0 1440
 
9.6%
3 1434
 
9.6%
2 951
 
6.4%
4 876
 
5.8%
7 488
 
3.3%
8 411
 
2.7%
9 176
 
1.2%
Other Punctuation
ValueCountFrequency (%)
, 5315
50.7%
. 2055
 
19.6%
/ 1922
 
18.3%
: 460
 
4.4%
' 363
 
3.5%
; 283
 
2.7%
" 42
 
0.4%
& 23
 
0.2%
# 19
 
0.2%
Close Punctuation
ValueCountFrequency (%)
) 564
98.3%
] 10
 
1.7%
Open Punctuation
ValueCountFrequency (%)
( 560
98.2%
[ 10
 
1.8%
Space Separator
ValueCountFrequency (%)
79042
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 354
100.0%
Math Symbol
ValueCountFrequency (%)
+ 35
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 591261
84.8%
Common 106032
 
15.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 60472
 
10.2%
e 55627
 
9.4%
o 49156
 
8.3%
r 47472
 
8.0%
t 40214
 
6.8%
i 29459
 
5.0%
n 26669
 
4.5%
p 24667
 
4.2%
m 24216
 
4.1%
l 23875
 
4.0%
Other values (42) 209434
35.4%
Common
ValueCountFrequency (%)
79042
74.5%
, 5315
 
5.0%
1 4138
 
3.9%
6 3402
 
3.2%
. 2055
 
1.9%
/ 1922
 
1.8%
5 1659
 
1.6%
0 1440
 
1.4%
3 1434
 
1.4%
2 951
 
0.9%
Other values (16) 4674
 
4.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 697293
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
79042
 
11.3%
a 60472
 
8.7%
e 55627
 
8.0%
o 49156
 
7.0%
r 47472
 
6.8%
t 40214
 
5.8%
i 29459
 
4.2%
n 26669
 
3.8%
p 24667
 
3.5%
m 24216
 
3.5%
Other values (68) 260299
37.3%

geologicalContextID
Text

Missing 

Distinct 3
Distinct (%) 100.0%
Missing 1926058
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 75
Median length 37
Mean length 46.66666667
Min length 28

Characters and Unicode

Total characters 140
Distinct characters 31
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 3
Unique (%) 100.0%

Sample

1st row North America, North Pacific Ocean, Departure Bay, Canada, British Columbia
2nd row North America, United States, Georgia
3rd row North America, United States
Value Count Frequency (%)
north 4 21.1%
america 3 15.8%
united 2 10.5%
states 2 10.5%
pacific 1 5.3%
ocean 1 5.3%
departure 1 5.3%
bay 1 5.3%
canada 1 5.3%
british 1 5.3%
Other values (2) 2 10.5%

Most occurring characters

ValueCountFrequency (%)
16
 
11.4%
a 14
 
10.0%
t 12
 
8.6%
r 11
 
7.9%
e 11
 
7.9%
i 11
 
7.9%
, 7
 
5.0%
o 6
 
4.3%
c 6
 
4.3%
h 5
 
3.6%
Other values (21) 41
29.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 98
70.0%
Uppercase Letter 19
 
13.6%
Space Separator 16
 
11.4%
Other Punctuation 7
 
5.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 14
14.3%
t 12
12.2%
r 11
11.2%
e 11
11.2%
i 11
11.2%
o 6
6.1%
c 6
6.1%
h 5
 
5.1%
n 4
 
4.1%
m 4
 
4.1%
Other values (9) 14
14.3%
Uppercase Letter
ValueCountFrequency (%)
N 4
21.1%
A 3
15.8%
S 2
10.5%
U 2
10.5%
B 2
10.5%
C 2
10.5%
G 1
 
5.3%
O 1
 
5.3%
D 1
 
5.3%
P 1
 
5.3%
Space Separator
ValueCountFrequency (%)
16
100.0%
Other Punctuation
ValueCountFrequency (%)
, 7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 117
83.6%
Common 23
 
16.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 14
12.0%
t 12
 
10.3%
r 11
 
9.4%
e 11
 
9.4%
i 11
 
9.4%
o 6
 
5.1%
c 6
 
5.1%
h 5
 
4.3%
n 4
 
3.4%
N 4
 
3.4%
Other values (19) 33
28.2%
Common
ValueCountFrequency (%)
16
69.6%
, 7
30.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 140
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
16
 
11.4%
a 14
 
10.0%
t 12
 
8.6%
r 11
 
7.9%
e 11
 
7.9%
i 11
 
7.9%
, 7
 
5.0%
o 6
 
4.3%
c 6
 
4.3%
h 5
 
3.6%
Other values (21) 41
29.3%
Distinct 2
Distinct (%) 66.7%
Missing 1926058
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 34
Median length 13
Mean length 20
Min length 13

Characters and Unicode

Total characters 60
Distinct characters 17
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 33.3%

Sample

1st row North America, North Pacific Ocean
2nd row North America
3rd row North America
ValueCountFrequency (%)
north 4
44.4%
america 3
33.3%
pacific 1
 
11.1%
ocean 1
 
11.1%

Most occurring characters

ValueCountFrequency (%)
r 7
11.7%
c 6
10.0%
6
10.0%
a 5
8.3%
i 5
8.3%
N 4
 
6.7%
o 4
 
6.7%
e 4
 
6.7%
h 4
 
6.7%
t 4
 
6.7%
Other values (7) 11
18.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 44
73.3%
Uppercase Letter 9
 
15.0%
Space Separator 6
 
10.0%
Other Punctuation 1
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r 7
15.9%
c 6
13.6%
a 5
11.4%
i 5
11.4%
o 4
9.1%
e 4
9.1%
h 4
9.1%
t 4
9.1%
m 3
6.8%
f 1
 
2.3%
Uppercase Letter
ValueCountFrequency (%)
N 4
44.4%
A 3
33.3%
P 1
 
11.1%
O 1
 
11.1%
Space Separator
ValueCountFrequency (%)
6
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 53
88.3%
Common 7
 
11.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
r 7
13.2%
c 6
11.3%
a 5
9.4%
i 5
9.4%
N 4
7.5%
o 4
7.5%
e 4
7.5%
h 4
7.5%
t 4
7.5%
m 3
5.7%
Other values (5) 7
13.2%
Common
ValueCountFrequency (%)
6
85.7%
, 1
 
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 60
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
r 7
11.7%
c 6
10.0%
6
10.0%
a 5
8.3%
i 5
8.3%
N 4
 
6.7%
o 4
 
6.7%
e 4
 
6.7%
h 4
 
6.7%
t 4
 
6.7%
Other values (7) 11
18.3%
Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 34
Median length 20.5
Mean length 20.5
Min length 7

Characters and Unicode

Total characters 41
Distinct characters 26
Distinct categories 6
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row North Pacific Ocean, Departure Bay
2nd row AL-1419
ValueCountFrequency (%)
north 1
16.7%
pacific 1
16.7%
ocean 1
16.7%
departure 1
16.7%
bay 1
16.7%
al-1419 1
16.7%

Most occurring characters

ValueCountFrequency (%)
4
 
9.8%
a 4
 
9.8%
r 3
 
7.3%
c 3
 
7.3%
e 3
 
7.3%
t 2
 
4.9%
1 2
 
4.9%
i 2
 
4.9%
N 1
 
2.4%
u 1
 
2.4%
Other values (16) 16
39.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 24
58.5%
Uppercase Letter 7
 
17.1%
Space Separator 4
 
9.8%
Decimal Number 4
 
9.8%
Dash Punctuation 1
 
2.4%
Other Punctuation 1
 
2.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
16.7%
r 3
12.5%
c 3
12.5%
e 3
12.5%
t 2
8.3%
i 2
8.3%
u 1
 
4.2%
y 1
 
4.2%
n 1
 
4.2%
p 1
 
4.2%
Other values (3) 3
12.5%
Uppercase Letter
ValueCountFrequency (%)
N 1
14.3%
L 1
14.3%
A 1
14.3%
B 1
14.3%
D 1
14.3%
O 1
14.3%
P 1
14.3%
Decimal Number
ValueCountFrequency (%)
1 2
50.0%
4 1
25.0%
9 1
25.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
, 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 31
75.6%
Common 10
 
24.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
 
12.9%
r 3
 
9.7%
c 3
 
9.7%
e 3
 
9.7%
t 2
 
6.5%
i 2
 
6.5%
N 1
 
3.2%
u 1
 
3.2%
L 1
 
3.2%
A 1
 
3.2%
Other values (10) 10
32.3%
Common
ValueCountFrequency (%)
4
40.0%
1 2
20.0%
4 1
 
10.0%
- 1
 
10.0%
, 1
 
10.0%
9 1
 
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4
 
9.8%
a 4
 
9.8%
r 3
 
7.3%
c 3
 
7.3%
e 3
 
7.3%
t 2
 
4.9%
1 2
 
4.9%
i 2
 
4.9%
N 1
 
2.4%
u 1
 
2.4%
Other values (16) 16
39.0%
Distinct 9
Distinct (%) 100.0%
Missing 1926052
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 10
Median length 10
Mean length 8.666666667
Min length 4

Characters and Unicode

Total characters 78
Distinct characters 11
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 9
Unique (%) 100.0%

Sample

1st row 1911-09-29
2nd row 1984-04-14
3rd row 1997-04-24
4th row 1962-06-19
5th row 1935-06-26
ValueCountFrequency (%)
1911-09-29 1
11.1%
1984-04-14 1
11.1%
1997-04-24 1
11.1%
1962-06-19 1
11.1%
1935-06-26 1
11.1%
1984-07-25 1
11.1%
1931 1
11.1%
1935-07-15 1
11.1%
1957 1
11.1%

Most occurring characters

ValueCountFrequency (%)
1 15
19.2%
- 14
17.9%
9 13
16.7%
0 7
9.0%
4 6
 
7.7%
2 5
 
6.4%
5 5
 
6.4%
7 4
 
5.1%
6 4
 
5.1%
3 3
 
3.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 64
82.1%
Dash Punctuation 14
 
17.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 15
23.4%
9 13
20.3%
0 7
10.9%
4 6
 
9.4%
2 5
 
7.8%
5 5
 
7.8%
7 4
 
6.2%
6 4
 
6.2%
3 3
 
4.7%
8 2
 
3.1%
Dash Punctuation
ValueCountFrequency (%)
- 14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 78
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 15
19.2%
- 14
17.9%
9 13
16.7%
0 7
9.0%
4 6
 
7.7%
2 5
 
6.4%
5 5
 
6.4%
7 4
 
5.1%
6 4
 
5.1%
3 3
 
3.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 78
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 15
19.2%
- 14
17.9%
9 13
16.7%
0 7
9.0%
4 6
 
7.7%
2 5
 
6.4%
5 5
 
6.4%
7 4
 
5.1%
6 4
 
5.1%
3 3
 
3.8%
Distinct 9
Distinct (%) 90.0%
Missing 1926051
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 13
Median length 3
Mean length 5.3
Min length 3

Characters and Unicode

Total characters 53
Distinct characters 19
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 8
Unique (%) 80.0%

Sample

1st row 272
2nd row 105
3rd row Canada
4th row 114
5th row 170
ValueCountFrequency (%)
united 2
16.7%
states 2
16.7%
272 1
8.3%
105 1
8.3%
canada 1
8.3%
114 1
8.3%
170 1
8.3%
177 1
8.3%
207 1
8.3%
196 1
8.3%

Most occurring characters

ValueCountFrequency (%)
t 6
11.3%
1 6
11.3%
7 5
 
9.4%
a 5
 
9.4%
e 4
 
7.5%
2 3
 
5.7%
0 3
 
5.7%
d 3
 
5.7%
n 3
 
5.7%
U 2
 
3.8%
Other values (9) 13
24.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 25
47.2%
Decimal Number 21
39.6%
Uppercase Letter 5
 
9.4%
Space Separator 2
 
3.8%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 6
28.6%
7 5
23.8%
2 3
14.3%
0 3
14.3%
5 1
 
4.8%
4 1
 
4.8%
9 1
 
4.8%
6 1
 
4.8%
Lowercase Letter
ValueCountFrequency (%)
t 6
24.0%
a 5
20.0%
e 4
16.0%
d 3
12.0%
n 3
12.0%
s 2
 
8.0%
i 2
 
8.0%
Uppercase Letter
ValueCountFrequency (%)
U 2
40.0%
S 2
40.0%
C 1
20.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 30
56.6%
Common 23
43.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 6
20.0%
a 5
16.7%
e 4
13.3%
d 3
10.0%
n 3
10.0%
U 2
 
6.7%
s 2
 
6.7%
S 2
 
6.7%
i 2
 
6.7%
C 1
 
3.3%
Common
ValueCountFrequency (%)
1 6
26.1%
7 5
21.7%
2 3
13.0%
0 3
13.0%
2
 
8.7%
5 1
 
4.3%
4 1
 
4.3%
9 1
 
4.3%
6 1
 
4.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 53
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 6
11.3%
1 6
11.3%
7 5
 
9.4%
a 5
 
9.4%
e 4
 
7.5%
2 3
 
5.7%
0 3
 
5.7%
d 3
 
5.7%
n 3
 
5.7%
U 2
 
3.8%
Other values (9) 13
24.5%
Distinct 7
Distinct (%) 100.0%
Missing 1926054
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 3
Median length 3
Mean length 3
Min length 3

Characters and Unicode

Total characters 21
Distinct characters 8
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row 272
2nd row 105
3rd row 114
4th row 170
5th row 177
ValueCountFrequency (%)
272 1
14.3%
105 1
14.3%
114 1
14.3%
170 1
14.3%
177 1
14.3%
207 1
14.3%
196 1
14.3%

Most occurring characters

ValueCountFrequency (%)
1 6
28.6%
7 5
23.8%
2 3
14.3%
0 3
14.3%
5 1
 
4.8%
4 1
 
4.8%
9 1
 
4.8%
6 1
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 21
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 6
28.6%
7 5
23.8%
2 3
14.3%
0 3
14.3%
5 1
 
4.8%
4 1
 
4.8%
9 1
 
4.8%
6 1
 
4.8%

Most occurring scripts

ValueCountFrequency (%)
Common 21
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 6
28.6%
7 5
23.8%
2 3
14.3%
0 3
14.3%
5 1
 
4.8%
4 1
 
4.8%
9 1
 
4.8%
6 1
 
4.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 21
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 6
28.6%
7 5
23.8%
2 3
14.3%
0 3
14.3%
5 1
 
4.8%
4 1
 
4.8%
9 1
 
4.8%
6 1
 
4.8%
Distinct 9
Distinct (%) 81.8%
Missing 1926050
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 16
Median length 4
Mean length 5.363636364
Min length 4

Characters and Unicode

Total characters 59
Distinct characters 26
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 63.6%

Sample

1st row 1911
2nd row 1984
3rd row British Columbia
4th row 1997
5th row 1962
ValueCountFrequency (%)
1984 2
16.7%
1935 2
16.7%
1911 1
8.3%
british 1
8.3%
columbia 1
8.3%
1997 1
8.3%
1962 1
8.3%
georgia 1
8.3%
1931 1
8.3%
1957 1
8.3%

Most occurring characters

ValueCountFrequency (%)
1 12
20.3%
9 10
16.9%
i 4
 
6.8%
3 3
 
5.1%
5 3
 
5.1%
4 2
 
3.4%
r 2
 
3.4%
7 2
 
3.4%
8 2
 
3.4%
o 2
 
3.4%
Other values (16) 17
28.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 36
61.0%
Lowercase Letter 19
32.2%
Uppercase Letter 3
 
5.1%
Space Separator 1
 
1.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 4
21.1%
r 2
10.5%
o 2
10.5%
a 2
10.5%
e 1
 
5.3%
m 1
 
5.3%
b 1
 
5.3%
u 1
 
5.3%
l 1
 
5.3%
h 1
 
5.3%
Other values (3) 3
15.8%
Decimal Number
ValueCountFrequency (%)
1 12
33.3%
9 10
27.8%
3 3
 
8.3%
5 3
 
8.3%
4 2
 
5.6%
7 2
 
5.6%
8 2
 
5.6%
2 1
 
2.8%
6 1
 
2.8%
Uppercase Letter
ValueCountFrequency (%)
G 1
33.3%
C 1
33.3%
B 1
33.3%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 37
62.7%
Latin 22
37.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 4
18.2%
r 2
 
9.1%
o 2
 
9.1%
a 2
 
9.1%
e 1
 
4.5%
m 1
 
4.5%
G 1
 
4.5%
b 1
 
4.5%
C 1
 
4.5%
u 1
 
4.5%
Other values (6) 6
27.3%
Common
ValueCountFrequency (%)
1 12
32.4%
9 10
27.0%
3 3
 
8.1%
5 3
 
8.1%
4 2
 
5.4%
7 2
 
5.4%
8 2
 
5.4%
2 1
 
2.7%
6 1
 
2.7%
1
 
2.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 59
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 12
20.3%
9 10
16.9%
i 4
 
6.8%
3 3
 
5.1%
5 3
 
5.1%
4 2
 
3.4%
r 2
 
3.4%
7 2
 
3.4%
8 2
 
3.4%
o 2
 
3.4%
Other values (16) 17
28.8%
Distinct 4
Distinct (%) 57.1%
Missing 1926054
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 1
Median length 1
Mean length 1
Min length 1

Characters and Unicode

Total characters 7
Distinct characters 4
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 14.3%

Sample

1st row 9
2nd row 4
3rd row 4
4th row 6
5th row 6
ValueCountFrequency (%)
4 2
28.6%
6 2
28.6%
7 2
28.6%
9 1
14.3%

Most occurring characters

ValueCountFrequency (%)
4 2
28.6%
6 2
28.6%
7 2
28.6%
9 1
14.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 7
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
4 2
28.6%
6 2
28.6%
7 2
28.6%
9 1
14.3%

Most occurring scripts

ValueCountFrequency (%)
Common 7
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
4 2
28.6%
6 2
28.6%
7 2
28.6%
9 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 7
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4 2
28.6%
6 2
28.6%
7 2
28.6%
9 1
14.3%
Distinct 7
Distinct (%) 100.0%
Missing 1926054
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 2
Median length 2
Mean length 2
Min length 2

Characters and Unicode

Total characters 14
Distinct characters 6
Distinct categories 1
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 7
Unique (%) 100.0%

Sample

1st row 29
2nd row 14
3rd row 24
4th row 19
5th row 26
ValueCountFrequency (%)
29 1
14.3%
14 1
14.3%
24 1
14.3%
19 1
14.3%
26 1
14.3%
25 1
14.3%
15 1
14.3%

Most occurring characters

ValueCountFrequency (%)
2 4
28.6%
1 3
21.4%
9 2
14.3%
4 2
14.3%
5 2
14.3%
6 1
 
7.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 14
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 4
28.6%
1 3
21.4%
9 2
14.3%
4 2
14.3%
5 2
14.3%
6 1
 
7.1%

Most occurring scripts

ValueCountFrequency (%)
Common 14
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 4
28.6%
1 3
21.4%
9 2
14.3%
4 2
14.3%
5 2
14.3%
6 1
 
7.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 4
28.6%
1 3
21.4%
9 2
14.3%
4 2
14.3%
5 2
14.3%
6 1
 
7.1%

latestAgeOrHighestStage
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926060
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 8
Distinct categories 2
Distinct scripts 1
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row Moultrie
ValueCountFrequency (%)
moultrie 1
100.0%

Most occurring characters

ValueCountFrequency (%)
M 1
12.5%
o 1
12.5%
u 1
12.5%
l 1
12.5%
t 1
12.5%
r 1
12.5%
i 1
12.5%
e 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7
87.5%
Uppercase Letter 1
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
14.3%
u 1
14.3%
l 1
14.3%
t 1
14.3%
r 1
14.3%
i 1
14.3%
e 1
14.3%
Uppercase Letter
ValueCountFrequency (%)
M 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 1
12.5%
o 1
12.5%
u 1
12.5%
l 1
12.5%
t 1
12.5%
r 1
12.5%
i 1
12.5%
e 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 1
12.5%
o 1
12.5%
u 1
12.5%
l 1
12.5%
t 1
12.5%
r 1
12.5%
i 1
12.5%
e 1
12.5%
Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 22
Median length 20.5
Mean length 20.5
Min length 19

Characters and Unicode

Total characters 41
Distinct characters 17
Distinct categories 3
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Hemionchos striatus
2nd row Conspicuum icteridorum
ValueCountFrequency (%)
hemionchos 1
25.0%
striatus 1
25.0%
conspicuum 1
25.0%
icteridorum 1
25.0%

Most occurring characters

ValueCountFrequency (%)
i 5
12.2%
s 4
9.8%
u 4
9.8%
o 4
9.8%
m 3
 
7.3%
c 3
 
7.3%
t 3
 
7.3%
r 3
 
7.3%
n 2
 
4.9%
e 2
 
4.9%
Other values (7) 8
19.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 37
90.2%
Space Separator 2
 
4.9%
Uppercase Letter 2
 
4.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 5
13.5%
s 4
10.8%
u 4
10.8%
o 4
10.8%
m 3
8.1%
c 3
8.1%
t 3
8.1%
r 3
8.1%
n 2
 
5.4%
e 2
 
5.4%
Other values (4) 4
10.8%
Uppercase Letter
ValueCountFrequency (%)
C 1
50.0%
H 1
50.0%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 39
95.1%
Common 2
 
4.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 5
12.8%
s 4
10.3%
u 4
10.3%
o 4
10.3%
m 3
7.7%
c 3
7.7%
t 3
7.7%
r 3
7.7%
n 2
 
5.1%
e 2
 
5.1%
Other values (6) 6
15.4%
Common
ValueCountFrequency (%)
2
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 41
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 5
12.2%
s 4
9.8%
u 4
9.8%
o 4
9.8%
m 3
 
7.3%
c 3
 
7.3%
t 3
 
7.3%
r 3
 
7.3%
n 2
 
4.9%
e 2
 
4.9%
Other values (7) 8
19.5%
Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 76
Median length 55
Mean length 55
Min length 34

Characters and Unicode

Total characters 110
Distinct characters 22
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Animalia, Platyhelminthes, Cestoda
2nd row Animalia, Platyhelminthes, Trematoda, Digenea, Plagiorchiida, Dicrocoeliidae
ValueCountFrequency (%)
animalia 2
22.2%
platyhelminthes 2
22.2%
cestoda 1
11.1%
trematoda 1
11.1%
digenea 1
11.1%
plagiorchiida 1
11.1%
dicrocoeliidae 1
11.1%

Most occurring characters

ValueCountFrequency (%)
i 13
11.8%
a 13
11.8%
e 10
 
9.1%
l 8
 
7.3%
, 7
 
6.4%
7
 
6.4%
t 6
 
5.5%
h 5
 
4.5%
o 5
 
4.5%
m 5
 
4.5%
Other values (12) 31
28.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 87
79.1%
Uppercase Letter 9
 
8.2%
Other Punctuation 7
 
6.4%
Space Separator 7
 
6.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 13
14.9%
a 13
14.9%
e 10
11.5%
l 8
9.2%
t 6
6.9%
h 5
 
5.7%
o 5
 
5.7%
m 5
 
5.7%
n 5
 
5.7%
d 4
 
4.6%
Other values (5) 13
14.9%
Uppercase Letter
ValueCountFrequency (%)
P 3
33.3%
D 2
22.2%
A 2
22.2%
C 1
 
11.1%
T 1
 
11.1%
Other Punctuation
ValueCountFrequency (%)
, 7
100.0%
Space Separator
ValueCountFrequency (%)
7
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 96
87.3%
Common 14
 
12.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 13
13.5%
a 13
13.5%
e 10
10.4%
l 8
 
8.3%
t 6
 
6.2%
h 5
 
5.2%
o 5
 
5.2%
m 5
 
5.2%
n 5
 
5.2%
d 4
 
4.2%
Other values (10) 22
22.9%
Common
ValueCountFrequency (%)
, 7
50.0%
7
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 110
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 13
11.8%
a 13
11.8%
e 10
 
9.1%
l 8
 
7.3%
, 7
 
6.4%
7
 
6.4%
t 6
 
5.5%
h 5
 
4.5%
o 5
 
4.5%
m 5
 
4.5%
Other values (12) 31
28.2%
Distinct 16
Distinct (%) 0.1%
Missing 1907923
Missing (%) 99.1%
Memory size 14.7 MiB

Length

Max length 50
Median length 3
Mean length 3.562189878
Min length 3

Characters and Unicode

Total characters 64611
Distinct characters 33
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 10
Unique (%) 0.1%

Sample

1st row cf.
2nd row cf.
3rd row uncertain
4th row cf.
5th row cf.
ValueCountFrequency (%)
cf 15633
86.0%
uncertain 1489
 
8.2%
aff 600
 
3.3%
near 404
 
2.2%
america 8
 
< 0.1%
north 4
 
< 0.1%
south 3
 
< 0.1%
brazil 2
 
< 0.1%
united 2
 
< 0.1%
states 2
 
< 0.1%
Other values (19) 21
 
0.1%

Most occurring characters

ValueCountFrequency (%)
c 17136
26.5%
f 16835
26.1%
. 16233
25.1%
n 3395
 
5.3%
a 2527
 
3.9%
r 1912
 
3.0%
e 1911
 
3.0%
i 1515
 
2.3%
t 1509
 
2.3%
u 1494
 
2.3%
Other values (23) 144
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 48293
74.7%
Other Punctuation 16245
 
25.1%
Uppercase Letter 43
 
0.1%
Space Separator 30
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c 17136
35.5%
f 16835
34.9%
n 3395
 
7.0%
a 2527
 
5.2%
r 1912
 
4.0%
e 1911
 
4.0%
i 1515
 
3.1%
t 1509
 
3.1%
u 1494
 
3.1%
o 17
 
< 0.1%
Other values (8) 42
 
0.1%
Uppercase Letter
ValueCountFrequency (%)
A 12
27.9%
S 7
16.3%
U 4
 
9.3%
N 4
 
9.3%
C 4
 
9.3%
P 4
 
9.3%
B 2
 
4.7%
R 2
 
4.7%
D 1
 
2.3%
L 1
 
2.3%
Other values (2) 2
 
4.7%
Other Punctuation
ValueCountFrequency (%)
. 16233
99.9%
, 12
 
0.1%
Space Separator
ValueCountFrequency (%)
30
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 48336
74.8%
Common 16275
 
25.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
c 17136
35.5%
f 16835
34.8%
n 3395
 
7.0%
a 2527
 
5.2%
r 1912
 
4.0%
e 1911
 
4.0%
i 1515
 
3.1%
t 1509
 
3.1%
u 1494
 
3.1%
o 17
 
< 0.1%
Other values (20) 85
 
0.2%
Common
ValueCountFrequency (%)
. 16233
99.7%
30
 
0.2%
, 12
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 64611
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c 17136
26.5%
f 16835
26.1%
. 16233
25.1%
n 3395
 
5.3%
a 2527
 
3.9%
r 1912
 
3.0%
e 1911
 
3.0%
i 1515
 
2.3%
t 1509
 
2.3%
u 1494
 
2.3%
Other values (23) 144
 
0.2%

typeStatus
Text

Missing 

Distinct 97
Distinct (%) 0.1%
Missing 1838230
Missing (%) 95.4%
Memory size 14.7 MiB

Length

Max length 40
Median length 8
Mean length 7.998383259
Min length 4

Characters and Unicode

Total characters 702506
Distinct characters 30
Distinct categories 5
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 32
Unique (%) < 0.1%

Sample

1st row Paratype
2nd row Holotype
3rd row Paratype
4th row Holotype
5th row Paratype
ValueCountFrequency (%)
paratype 41423
45.5%
holotype 26115
28.7%
syntype 10062
 
11.1%
type 5398
 
5.9%
allotype 3095
 
3.4%
paralectotype 1159
 
1.3%
1105
 
1.2%
lectotype 1071
 
1.2%
neotype 306
 
0.3%
unconfirmed 292
 
0.3%
Other values (25) 916
 
1.0%

Most occurring characters

ValueCountFrequency (%)
y 99691
14.2%
e 92427
13.2%
p 90036
12.8%
t 87236
12.4%
a 86598
12.3%
o 59520
8.5%
P 43202
6.1%
r 43155
6.1%
l 33479
 
4.8%
H 26363
 
3.8%
Other values (20) 40799
5.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 607854
86.5%
Uppercase Letter 89837
 
12.8%
Space Separator 3111
 
0.4%
Math Symbol 1105
 
0.2%
Other Punctuation 599
 
0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
y 99691
16.4%
e 92427
15.2%
p 90036
14.8%
t 87236
14.4%
a 86598
14.2%
o 59520
9.8%
r 43155
7.1%
l 33479
 
5.5%
n 11250
 
1.9%
c 2538
 
0.4%
Other values (7) 1924
 
0.3%
Uppercase Letter
ValueCountFrequency (%)
P 43202
48.1%
H 26363
29.3%
S 10065
 
11.2%
T 5398
 
6.0%
A 3103
 
3.5%
L 1075
 
1.2%
N 336
 
0.4%
U 292
 
0.3%
C 2
 
< 0.1%
O 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3111
100.0%
Math Symbol
ValueCountFrequency (%)
+ 1105
100.0%
Other Punctuation
ValueCountFrequency (%)
; 599
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 697691
99.3%
Common 4815
 
0.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
y 99691
14.3%
e 92427
13.2%
p 90036
12.9%
t 87236
12.5%
a 86598
12.4%
o 59520
8.5%
P 43202
6.2%
r 43155
6.2%
l 33479
 
4.8%
H 26363
 
3.8%
Other values (17) 35984
 
5.2%
Common
ValueCountFrequency (%)
3111
64.6%
+ 1105
 
22.9%
; 599
 
12.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 702506
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
y 99691
14.2%
e 92427
13.2%
p 90036
12.8%
t 87236
12.4%
a 86598
12.3%
o 59520
8.5%
P 43202
6.1%
r 43155
6.1%
l 33479
 
4.8%
H 26363
 
3.8%
Other values (20) 40799
5.8%

identifiedBy
Text

Missing 

Distinct 13462
Distinct (%) 1.6%
Missing 1085026
Missing (%) 56.3%
Memory size 14.7 MiB

Length

Max length 226
Median length 133
Mean length 38.24104467
Min length 2

Characters and Unicode

Total characters 32162057
Distinct characters 94
Distinct categories 9
Distinct scripts 2
Distinct blocks 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 4203
Unique (%) 0.5%

Sample

1st row Opresko, Dennis M., Oak Ridge National Laboratory (UNITED STATES)
2nd row Nance
3rd row Mah, Christopher, (IZ), Smithsonian Institution - National Museum of Natural History (UNITED STATES)
4th row Verrill, Addison E., Peabody Museum, Yale
5th row Judkins, D.
ValueCountFrequency (%)
of 247151
 
5.3%
museum 200612
 
4.3%
national 197093
 
4.2%
institution 188563
 
4.1%
smithsonian 186033
 
4.0%
natural 185749
 
4.0%
history 185395
 
4.0%
united 130387
 
2.8%
states 129618
 
2.8%
87179
 
1.9%
Other values (9433) 2903748
62.6%

Most occurring characters

ValueCountFrequency (%)
3800493
 
11.8%
a 2080181
 
6.5%
i 2055895
 
6.4%
t 2012865
 
6.3%
n 1895740
 
5.9%
o 1744508
 
5.4%
e 1499826
 
4.7%
r 1384664
 
4.3%
s 1382519
 
4.3%
, 1349141
 
4.2%
Other values (84) 12956225
40.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19463991
60.5%
Uppercase Letter 5956500
 
18.5%
Space Separator 3800493
 
11.8%
Other Punctuation 2376958
 
7.4%
Open Punctuation 230274
 
0.7%
Close Punctuation 230274
 
0.7%
Dash Punctuation 97626
 
0.3%
Decimal Number 5852
 
< 0.1%
Math Symbol 89
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2080181
10.7%
i 2055895
10.6%
t 2012865
10.3%
n 1895740
9.7%
o 1744508
9.0%
e 1499826
7.7%
r 1384664
7.1%
s 1382519
7.1%
u 1079640
 
5.5%
l 969545
 
5.0%
Other values (37) 3358608
17.3%
Uppercase Letter
ValueCountFrequency (%)
S 646247
 
10.8%
N 570315
 
9.6%
M 471177
 
7.9%
I 456181
 
7.7%
T 454039
 
7.6%
H 422870
 
7.1%
E 378757
 
6.4%
A 333690
 
5.6%
D 272572
 
4.6%
C 241386
 
4.1%
Other values (18) 1709266
28.7%
Other Punctuation
ValueCountFrequency (%)
, 1349141
56.8%
. 937156
39.4%
; 64066
 
2.7%
/ 16438
 
0.7%
& 5585
 
0.2%
' 4526
 
0.2%
" 46
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
4 2732
46.7%
1 2732
46.7%
2 148
 
2.5%
0 92
 
1.6%
6 74
 
1.3%
9 74
 
1.3%
Dash Punctuation
ValueCountFrequency (%)
- 97619
> 99.9%
7
 
< 0.1%
Space Separator
ValueCountFrequency (%)
3800493
100.0%
Open Punctuation
ValueCountFrequency (%)
( 230274
100.0%
Close Punctuation
ValueCountFrequency (%)
) 230274
100.0%
Math Symbol
ValueCountFrequency (%)
+ 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 25420491
79.0%
Common 6741566
 
21.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2080181
 
8.2%
i 2055895
 
8.1%
t 2012865
 
7.9%
n 1895740
 
7.5%
o 1744508
 
6.9%
e 1499826
 
5.9%
r 1384664
 
5.4%
s 1382519
 
5.4%
u 1079640
 
4.2%
l 969545
 
3.8%
Other values (65) 9315108
36.6%
Common
ValueCountFrequency (%)
3800493
56.4%
, 1349141
 
20.0%
. 937156
 
13.9%
( 230274
 
3.4%
) 230274
 
3.4%
- 97619
 
1.4%
; 64066
 
1.0%
/ 16438
 
0.2%
& 5585
 
0.1%
' 4526
 
0.1%
Other values (9) 5994
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 32156515
> 99.9%
None 5535
 
< 0.1%
Punctuation 7
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3800493
 
11.8%
a 2080181
 
6.5%
i 2055895
 
6.4%
t 2012865
 
6.3%
n 1895740
 
5.9%
o 1744508
 
5.4%
e 1499826
 
4.7%
r 1384664
 
4.3%
s 1382519
 
4.3%
, 1349141
 
4.2%
Other values (60) 12950683
40.3%
None
ValueCountFrequency (%)
é 1458
26.3%
í 1289
23.3%
á 848
15.3%
ñ 436
 
7.9%
ã 401
 
7.2%
è 285
 
5.1%
ö 217
 
3.9%
ç 159
 
2.9%
ó 99
 
1.8%
ø 98
 
1.8%
Other values (13) 245
 
4.4%
Punctuation
ValueCountFrequency (%)
7
100.0%

identifiedByID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 13
Median length 10
Mean length 10
Min length 7

Characters and Unicode

Total characters 20
Distinct characters 15
Distinct categories 4
Distinct scripts 2
Distinct blocks 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row 31.1435
2nd row Plagiorchiida
ValueCountFrequency (%)
31.1435 1
50.0%
plagiorchiida 1
50.0%

Most occurring characters

ValueCountFrequency (%)
i 3
15.0%
3 2
 
10.0%
1 2
 
10.0%
a 2
 
10.0%
. 1
 
5.0%
4 1
 
5.0%
5 1
 
5.0%
P 1
 
5.0%
l 1
 
5.0%
g 1
 
5.0%
Other values (5) 5
25.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12
60.0%
Decimal Number 6
30.0%
Other Punctuation 1
 
5.0%
Uppercase Letter 1
 
5.0%
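
The category counts above can be reproduced with Python's standard `unicodedata` module. The sketch below is illustrative only: it assumes the two sample values shown for this column are its only non-missing values, and the `CATEGORY_NAMES` mapping is a hand-written helper covering just the four general categories that occur here.

```python
import unicodedata
from collections import Counter

# Readable names for the general-category codes that occur in these two values
# (hand-written for this example, not an exhaustive mapping).
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Nd": "Decimal Number",
    "Po": "Other Punctuation",
}

# The two non-missing values reported in the Sample section of this column.
values = ["31.1435", "Plagiorchiida"]

# Count every character by its Unicode general category.
category_counts = Counter(
    CATEGORY_NAMES.get(unicodedata.category(ch), unicodedata.category(ch))
    for value in values
    for ch in value
)

for name, count in category_counts.most_common():
    print(name, count)
```

The totals add up to the 20 characters reported for the column.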

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 3
25.0%
a 2
16.7%
l 1
 
8.3%
g 1
 
8.3%
o 1
 
8.3%
r 1
 
8.3%
c 1
 
8.3%
h 1
 
8.3%
d 1
 
8.3%
Decimal Number
ValueCountFrequency (%)
3 2
33.3%
1 2
33.3%
4 1
16.7%
5 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%
Uppercase Letter
ValueCountFrequency (%)
P 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 13
65.0%
Common 7
35.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 3
23.1%
a 2
15.4%
P 1
 
7.7%
l 1
 
7.7%
g 1
 
7.7%
o 1
 
7.7%
r 1
 
7.7%
c 1
 
7.7%
h 1
 
7.7%
d 1
 
7.7%
Common
ValueCountFrequency (%)
3 2
28.6%
1 2
28.6%
. 1
14.3%
4 1
14.3%
5 1
14.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 3
15.0%
3 2
 
10.0%
1 2
 
10.0%
a 2
 
10.0%
. 1
 
5.0%
4 1
 
5.0%
5 1
 
5.0%
P 1
 
5.0%
l 1
 
5.0%
g 1
 
5.0%
Other values (5) 5
25.0%

dateIdentified
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926060
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 8
Median length 8
Mean length 8
Min length 8

Characters and Unicode

Total characters 8
Distinct characters 7
Distinct categories 3
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row -83.7685

Value Count Frequency (%)
83.7685 1 100.0%

Most occurring characters

ValueCountFrequency (%)
8 2
25.0%
- 1
12.5%
3 1
12.5%
. 1
12.5%
7 1
12.5%
6 1
12.5%
5 1
12.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 6
75.0%
Dash Punctuation 1
 
12.5%
Other Punctuation 1
 
12.5%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
8 2
33.3%
3 1
16.7%
7 1
16.7%
6 1
16.7%
5 1
16.7%
Dash Punctuation
ValueCountFrequency (%)
- 1
100.0%
Other Punctuation
ValueCountFrequency (%)
. 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
8 2
25.0%
- 1
12.5%
3 1
12.5%
. 1
12.5%
7 1
12.5%
6 1
12.5%
5 1
12.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 8
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
8 2
25.0%
- 1
12.5%
3 1
12.5%
. 1
12.5%
7 1
12.5%
6 1
12.5%
5 1
12.5%
Distinct 7
Distinct (%) 77.8%
Missing 1926052
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 14
Median length 13
Mean length 9.777777778
Min length 6

Characters and Unicode

Total characters 88
Distinct characters 24
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 5
Unique (%) 55.6%

Sample

1st row United States
2nd row Brazil
3rd row Puerto Rico
4th row Argentina
5th row United States

Value Count Frequency (%)
united 2 15.4%
states 2 15.4%
brazil 2 15.4%
puerto 1 7.7%
rico 1 7.7%
argentina 1 7.7%
costa 1 7.7%
rica 1 7.7%
dicrocoeliidae 1 7.7%
panama 1 7.7%

Most occurring characters

ValueCountFrequency (%)
a 11
12.5%
i 10
11.4%
t 9
 
10.2%
e 8
 
9.1%
o 5
 
5.7%
r 5
 
5.7%
n 5
 
5.7%
4
 
4.5%
c 4
 
4.5%
d 3
 
3.4%
Other values (14) 24
27.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 71
80.7%
Uppercase Letter 13
 
14.8%
Space Separator 4
 
4.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 11
15.5%
i 10
14.1%
t 9
12.7%
e 8
11.3%
o 5
7.0%
r 5
7.0%
n 5
7.0%
c 4
 
5.6%
d 3
 
4.2%
s 3
 
4.2%
Other values (5) 8
11.3%
Uppercase Letter
ValueCountFrequency (%)
U 2
15.4%
R 2
15.4%
P 2
15.4%
B 2
15.4%
S 2
15.4%
A 1
7.7%
C 1
7.7%
D 1
7.7%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 84
95.5%
Common 4
 
4.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 11
13.1%
i 10
11.9%
t 9
10.7%
e 8
 
9.5%
o 5
 
6.0%
r 5
 
6.0%
n 5
 
6.0%
c 4
 
4.8%
d 3
 
3.6%
s 3
 
3.6%
Other values (13) 21
25.0%
Common
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 88
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 11
12.5%
i 10
11.4%
t 9
 
10.2%
e 8
 
9.1%
o 5
 
5.7%
r 5
 
5.7%
n 5
 
5.7%
4
 
4.5%
c 4
 
4.5%
d 3
 
3.4%
Other values (14) 24
27.3%
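
The summary figures for this column (Distinct 7, Unique 5, Total characters 88, Missing 1926052) can be cross-checked with a few lines of Python. This is a sketch under stated assumptions: the nine values are reconstructed from the sample rows and the word and uppercase-letter counts above, so their order after the fifth row and the exact capitalisation of the last four are inferred rather than read from the data.

```python
from collections import Counter

TOTAL_ROWS = 1926061  # number of observations stated in the report overview

# Reconstructed non-missing values (first five from the Sample section,
# remaining four inferred from the word-frequency table).
values = [
    "United States", "Brazil", "Puerto Rico", "Argentina", "United States",
    "Brazil", "Costa Rica", "Dicrocoeliidae", "Panama",
]

counts = Counter(values)
n = len(values)

distinct = len(counts)                               # different values
unique = sum(1 for c in counts.values() if c == 1)   # values occurring exactly once
missing = TOTAL_ROWS - n
total_chars = sum(len(v) for v in values)

print(f"Distinct {distinct} ({100 * distinct / n:.1f}%)")
print(f"Unique {unique} ({100 * unique / n:.1f}%)")
print(f"Missing {missing}")
print(f"Total characters {total_chars}, mean length {total_chars / n:.9f}")
```

Note that both percentages are taken relative to the non-missing count (9), not to the total number of rows.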

identificationRemarks
Text

Missing 

Distinct 5
Distinct (%) 100.0%
Missing 1926056
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 20
Median length 9
Mean length 10.8
Min length 8

Characters and Unicode

Total characters 54
Distinct characters 23
Distinct categories 3
Distinct scripts 2
Distinct blocks 1

Unique

Unique 5
Unique (%) 100.0%

Sample

1st row District of Columbia
2nd row Amazonas
3rd row Louisiana
4th row Sao Paulo
5th row San Jose

Value Count Frequency (%)
district 1 11.1%
of 1 11.1%
columbia 1 11.1%
amazonas 1 11.1%
louisiana 1 11.1%
sao 1 11.1%
paulo 1 11.1%
san 1 11.1%
jose 1 11.1%

Most occurring characters

ValueCountFrequency (%)
a 8
14.8%
o 7
13.0%
i 5
 
9.3%
s 4
 
7.4%
4
 
7.4%
u 3
 
5.6%
n 3
 
5.6%
t 2
 
3.7%
S 2
 
3.7%
l 2
 
3.7%
Other values (13) 14
25.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 42
77.8%
Uppercase Letter 8
 
14.8%
Space Separator 4
 
7.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 8
19.0%
o 7
16.7%
i 5
11.9%
s 4
9.5%
u 3
 
7.1%
n 3
 
7.1%
t 2
 
4.8%
l 2
 
4.8%
m 2
 
4.8%
z 1
 
2.4%
Other values (5) 5
11.9%
Uppercase Letter
ValueCountFrequency (%)
S 2
25.0%
J 1
12.5%
P 1
12.5%
L 1
12.5%
D 1
12.5%
A 1
12.5%
C 1
12.5%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 50
92.6%
Common 4
 
7.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 8
16.0%
o 7
14.0%
i 5
10.0%
s 4
 
8.0%
u 3
 
6.0%
n 3
 
6.0%
t 2
 
4.0%
S 2
 
4.0%
l 2
 
4.0%
m 2
 
4.0%
Other values (12) 12
24.0%
Common
ValueCountFrequency (%)
4
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 54
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 8
14.8%
o 7
13.0%
i 5
 
9.3%
s 4
 
7.4%
4
 
7.4%
u 3
 
5.6%
n 3
 
5.6%
t 2
 
3.7%
S 2
 
3.7%
l 2
 
3.7%
Other values (13) 14
25.9%

scientificNameID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 10
Median length 10
Mean length 10
Min length 10

Characters and Unicode

Total characters 20
Distinct characters 12
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Hemionchos
2nd row Conspicuum

Value Count Frequency (%)
hemionchos 1 50.0%
conspicuum 1 50.0%

Most occurring characters

ValueCountFrequency (%)
o 3
15.0%
m 2
10.0%
i 2
10.0%
n 2
10.0%
c 2
10.0%
s 2
10.0%
u 2
10.0%
H 1
 
5.0%
e 1
 
5.0%
h 1
 
5.0%
Other values (2) 2
10.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 18
90.0%
Uppercase Letter 2
 
10.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
16.7%
m 2
11.1%
i 2
11.1%
n 2
11.1%
c 2
11.1%
s 2
11.1%
u 2
11.1%
e 1
 
5.6%
h 1
 
5.6%
p 1
 
5.6%
Uppercase Letter
ValueCountFrequency (%)
H 1
50.0%
C 1
50.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 20
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 3
15.0%
m 2
10.0%
i 2
10.0%
n 2
10.0%
c 2
10.0%
s 2
10.0%
u 2
10.0%
H 1
 
5.0%
e 1
 
5.0%
h 1
 
5.0%
Other values (2) 2
10.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 20
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 3
15.0%
m 2
10.0%
i 2
10.0%
n 2
10.0%
c 2
10.0%
s 2
10.0%
u 2
10.0%
H 1
 
5.0%
e 1
 
5.0%
h 1
 
5.0%
Other values (2) 2
10.0%

acceptedNameUsageID
Text

Missing 

Distinct 8
Distinct (%) 100.0%
Missing 1926053
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 56
Median length 11.5
Mean length 19.5
Min length 4

Characters and Unicode

Total characters 156
Distinct characters 34
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 8
Unique (%) 100.0%

Sample

1st row Washington DC
2nd row Manaus, Rio Solimoes, Ilha Da Marchantaria
3rd row Ponce
4th row Azul
5th row Raceland

Value Count Frequency (%)
washington 1 4.2%
dc 1 4.2%
carcoles 1 4.2%
off 1 4.2%
piracicaba 1 4.2%
frane 1 4.2%
camora 1 4.2%
from 1 4.2%
segment 1 4.2%
endeavour 1 4.2%
Other values (14) 14 58.3%

Most occurring characters

ValueCountFrequency (%)
a 21
 
13.5%
16
 
10.3%
n 11
 
7.1%
e 10
 
6.4%
o 10
 
6.4%
i 8
 
5.1%
r 8
 
5.1%
c 7
 
4.5%
u 5
 
3.2%
l 5
 
3.2%
Other values (24) 55
35.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 112
71.8%
Uppercase Letter 24
 
15.4%
Space Separator 16
 
10.3%
Other Punctuation 4
 
2.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 21
18.8%
n 11
9.8%
e 10
8.9%
o 10
8.9%
i 8
 
7.1%
r 8
 
7.1%
c 7
 
6.2%
u 5
 
4.5%
l 5
 
4.5%
t 4
 
3.6%
Other values (9) 23
20.5%
Uppercase Letter
ValueCountFrequency (%)
C 3
12.5%
R 3
12.5%
P 3
12.5%
F 3
12.5%
S 2
8.3%
M 2
8.3%
D 2
8.3%
I 1
 
4.2%
A 1
 
4.2%
J 1
 
4.2%
Other values (3) 3
12.5%
Space Separator
ValueCountFrequency (%)
16
100.0%
Other Punctuation
ValueCountFrequency (%)
, 4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 136
87.2%
Common 20
 
12.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 21
15.4%
n 11
 
8.1%
e 10
 
7.4%
o 10
 
7.4%
i 8
 
5.9%
r 8
 
5.9%
c 7
 
5.1%
u 5
 
3.7%
l 5
 
3.7%
t 4
 
2.9%
Other values (22) 47
34.6%
Common
ValueCountFrequency (%)
16
80.0%
, 4
 
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 156
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 21
 
13.5%
16
 
10.3%
n 11
 
7.1%
e 10
 
6.4%
o 10
 
6.4%
i 8
 
5.1%
r 8
 
5.1%
c 7
 
4.5%
u 5
 
3.2%
l 5
 
3.2%
Other values (24) 55
35.3%

nameAccordingToID
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 11
Median length 9.5
Mean length 9.5
Min length 8

Characters and Unicode

Total characters 19
Distinct characters 11
Distinct categories 1
Distinct scripts 1
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row striatus
2nd row icteridorum

Value Count Frequency (%)
striatus 1 50.0%
icteridorum 1 50.0%

Most occurring characters

ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 19
100.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 19
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t 3
15.8%
r 3
15.8%
i 3
15.8%
s 2
10.5%
u 2
10.5%
a 1
 
5.3%
c 1
 
5.3%
e 1
 
5.3%
d 1
 
5.3%
o 1
 
5.3%

scientificName
Text

Missing 

Distinct 133983
Distinct (%) 8.5%
Missing 353701
Missing (%) 18.4%
Memory size 14.7 MiB

Length

Max length 85
Median length 59
Mean length 19.4468843
Min length 4

Characters and Unicode

Total characters 30577503
Distinct characters 78
Distinct categories 9
Distinct scripts 2
Distinct blocks 2

Unique

Unique 51616
Unique (%) 3.3%

Sample

1st row Scypha sp.
2nd row Bulla striata
3rd row Stylopathes columnaris
4th row Ophiothrix suensonii
5th row Cypraea labrolineata

Value Count Frequency (%)
sp 198030 6.0%
conus 24321 0.7%
cypraea 15393 0.5%
cambarus 12002 0.4%
cerithium 9394 0.3%
orconectes 8683 0.3%
procambarus 8139 0.2%
nassarius 6728 0.2%
gracilis 6630 0.2%
terebra 5167 0.2%
Other values (70823) 3024717 91.1%

Most occurring characters

ValueCountFrequency (%)
a 3609833
 
11.8%
i 2749960
 
9.0%
s 2277139
 
7.4%
e 1954011
 
6.4%
r 1901028
 
6.2%
o 1840277
 
6.0%
1746844
 
5.7%
l 1713984
 
5.6%
n 1541454
 
5.0%
t 1536972
 
5.0%
Other values (68) 9706001
31.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 26720492
87.4%
Space Separator 1746844
 
5.7%
Uppercase Letter 1685116
 
5.5%
Other Punctuation 198760
 
0.7%
Open Punctuation 112845
 
0.4%
Close Punctuation 112845
 
0.4%
Decimal Number 473
 
< 0.1%
Dash Punctuation 110
 
< 0.1%
Math Symbol 18
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3609833
13.5%
i 2749960
10.3%
s 2277139
 
8.5%
e 1954011
 
7.3%
r 1901028
 
7.1%
o 1840277
 
6.9%
l 1713984
 
6.4%
n 1541454
 
5.8%
t 1536972
 
5.8%
u 1522029
 
5.7%
Other values (18) 6073805
22.7%
Uppercase Letter
ValueCountFrequency (%)
C 246678
14.6%
P 236724
14.0%
A 165911
9.8%
S 135225
 
8.0%
M 109797
 
6.5%
T 106733
 
6.3%
L 97688
 
5.8%
E 85762
 
5.1%
O 78979
 
4.7%
N 66249
 
3.9%
Other values (16) 355370
21.1%
Decimal Number
ValueCountFrequency (%)
1 156
33.0%
8 111
23.5%
4 58
 
12.3%
9 38
 
8.0%
6 27
 
5.7%
2 27
 
5.7%
5 19
 
4.0%
7 16
 
3.4%
0 14
 
3.0%
3 7
 
1.5%
Other Punctuation
ValueCountFrequency (%)
. 198543
99.9%
, 107
 
0.1%
" 60
 
< 0.1%
/ 29
 
< 0.1%
' 15
 
< 0.1%
& 3
 
< 0.1%
? 3
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 112844
> 99.9%
[ 1
 
< 0.1%
Close Punctuation
ValueCountFrequency (%)
) 112844
> 99.9%
] 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
1746844
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 110
100.0%
Math Symbol
ValueCountFrequency (%)
+ 18
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 28405608
92.9%
Common 2171895
 
7.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3609833
12.7%
i 2749960
 
9.7%
s 2277139
 
8.0%
e 1954011
 
6.9%
r 1901028
 
6.7%
o 1840277
 
6.5%
l 1713984
 
6.0%
n 1541454
 
5.4%
t 1536972
 
5.4%
u 1522029
 
5.4%
Other values (44) 7758921
27.3%
Common
ValueCountFrequency (%)
1746844
80.4%
. 198543
 
9.1%
( 112844
 
5.2%
) 112844
 
5.2%
1 156
 
< 0.1%
8 111
 
< 0.1%
- 110
 
< 0.1%
, 107
 
< 0.1%
" 60
 
< 0.1%
4 58
 
< 0.1%
Other values (14) 218
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30577487
> 99.9%
None 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3609833
 
11.8%
i 2749960
 
9.0%
s 2277139
 
7.4%
e 1954011
 
6.4%
r 1901028
 
6.2%
o 1840277
 
6.0%
1746844
 
5.7%
l 1713984
 
5.6%
n 1541454
 
5.0%
t 1536972
 
5.0%
Other values (66) 9705985
31.7%
None
ValueCountFrequency (%)
ü 15
93.8%
æ 1
 
6.2%

parentNameUsage
Text

Missing 

Distinct 2
Distinct (%) 100.0%
Missing 1926059
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 20
Median length 16.5
Mean length 16.5
Min length 13

Characters and Unicode

Total characters 33
Distinct characters 20
Distinct categories 4
Distinct scripts 2
Distinct blocks 1

Unique

Unique 2
Unique (%) 100.0%

Sample

1st row Campbell & Beveridge
2nd row Denton & Byrd

Value Count Frequency (%)
(empty string) 2 33.3%
campbell 1 16.7%
beveridge 1 16.7%
denton 1 16.7%
byrd 1 16.7%

Most occurring characters

ValueCountFrequency (%)
e 5
15.2%
4
 
12.1%
d 2
 
6.1%
n 2
 
6.1%
l 2
 
6.1%
& 2
 
6.1%
B 2
 
6.1%
r 2
 
6.1%
C 1
 
3.0%
o 1
 
3.0%
Other values (10) 10
30.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 23
69.7%
Space Separator 4
 
12.1%
Uppercase Letter 4
 
12.1%
Other Punctuation 2
 
6.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 5
21.7%
d 2
 
8.7%
n 2
 
8.7%
l 2
 
8.7%
r 2
 
8.7%
o 1
 
4.3%
t 1
 
4.3%
g 1
 
4.3%
v 1
 
4.3%
i 1
 
4.3%
Other values (5) 5
21.7%
Uppercase Letter
ValueCountFrequency (%)
B 2
50.0%
C 1
25.0%
D 1
25.0%
Space Separator
ValueCountFrequency (%)
4
100.0%
Other Punctuation
ValueCountFrequency (%)
& 2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
81.8%
Common 6
 
18.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 5
18.5%
d 2
 
7.4%
n 2
 
7.4%
l 2
 
7.4%
B 2
 
7.4%
r 2
 
7.4%
C 1
 
3.7%
o 1
 
3.7%
t 1
 
3.7%
D 1
 
3.7%
Other values (8) 8
29.6%
Common
ValueCountFrequency (%)
4
66.7%
& 2
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 33
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 5
15.2%
4
 
12.1%
d 2
 
6.1%
n 2
 
6.1%
l 2
 
6.1%
& 2
 
6.1%
B 2
 
6.1%
r 2
 
6.1%
C 1
 
3.0%
o 1
 
3.0%
Other values (10) 10
30.3%
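
The word table for this column lists a blank value with a count of 2. One plausible explanation, offered here as an assumption rather than a description of the profiler's actual implementation, is that each value is lower-cased, split on whitespace, and then stripped of surrounding punctuation, which turns the two `&` tokens into empty strings. A minimal sketch of that tokenisation:

```python
import string
from collections import Counter

# The two non-missing values shown in the Sample section of this column.
values = ["Campbell & Beveridge", "Denton & Byrd"]

# Lower-case, split on whitespace, then strip punctuation from both ends.
# A token consisting only of punctuation ("&") becomes the empty string.
word_counts = Counter(
    token.strip(string.punctuation)
    for value in values
    for token in value.lower().split()
)

total = sum(word_counts.values())
for word, count in word_counts.most_common():
    print(repr(word), count, f"{100 * count / total:.1f}%")
```

Under this assumption the empty string accounts for 2 of the 6 tokens (33.3%) and each name for 1 of 6 (16.7%), matching the table.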

originalNameUsage
Text

Constant  Missing 

Distinct 1
Distinct (%) 100.0%
Missing 1926060
Missing (%) > 99.9%
Memory size 14.7 MiB

Length

Max length 9
Median length 9
Mean length 9
Min length 9

Characters and Unicode

Total characters 9
Distinct characters 9
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 1
Unique (%) 100.0%

Sample

1st row GEOLocate

Value Count Frequency (%)
geolocate 1 100.0%

Most occurring characters

ValueCountFrequency (%)
G 1
11.1%
E 1
11.1%
O 1
11.1%
L 1
11.1%
o 1
11.1%
c 1
11.1%
a 1
11.1%
t 1
11.1%
e 1
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
55.6%
Uppercase Letter 4
44.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 1
20.0%
c 1
20.0%
a 1
20.0%
t 1
20.0%
e 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
G 1
25.0%
E 1
25.0%
O 1
25.0%
L 1
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 1
11.1%
E 1
11.1%
O 1
11.1%
L 1
11.1%
o 1
11.1%
c 1
11.1%
a 1
11.1%
t 1
11.1%
e 1
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 1
11.1%
E 1
11.1%
O 1
11.1%
L 1
11.1%
o 1
11.1%
c 1
11.1%
a 1
11.1%
t 1
11.1%
e 1
11.1%
Distinct 4360
Distinct (%) 0.2%
Missing 474
Missing (%) < 0.1%
Memory size 14.7 MiB

Length

Max length 134
Median length 117
Mean length 62.96713054
Min length 5

Characters and Unicode

Total characters 121248688
Distinct characters 70
Distinct categories 10
Distinct scripts 2
Distinct blocks 1

Unique

Unique 592
Unique (%) < 0.1%

Sample

1st row Animalia, Porifera, Calcarea
2nd row Animalia, Mollusca, Gastropoda, Bullidae
3rd row Animalia, Cnidaria, Anthozoa, Hexacorallia, Antipatharia, Stylopathidae
4th row Animalia, Echinodermata, Ophiuroidea, Ophiurida, Ophiotrichidae
5th row Animalia, Mollusca, Gastropoda, Cypraeidae

Value Count Frequency (%)
animalia 1921701 18.1%
mollusca 866254 8.1%
gastropoda 612643 5.8%
arthropoda 390685 3.7%
crustacea 385047 3.6%
malacostraca 301920 2.8%
eumalacostraca 294842 2.8%
annelida 241745 2.3%
polychaeta 212926 2.0%
bivalvia 207657 2.0%
Other values (4348) 5201888 48.9%

Most occurring characters

ValueCountFrequency (%)
a 19357102
16.0%
i 10627524
 
8.8%
8711721
 
7.2%
, 8690181
 
7.2%
o 7922403
 
6.5%
l 7524872
 
6.2%
e 6161710
 
5.1%
d 5674220
 
4.7%
r 5611641
 
4.6%
c 5022856
 
4.1%
Other values (60) 35944458
29.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 93230361
76.9%
Uppercase Letter 10615593
 
8.8%
Space Separator 8711721
 
7.2%
Other Punctuation 8690229
 
7.2%
Dash Punctuation 285
 
< 0.1%
Open Punctuation 169
 
< 0.1%
Close Punctuation 169
 
< 0.1%
Connector Punctuation 126
 
< 0.1%
Decimal Number 32
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 19357102
20.8%
i 10627524
11.4%
o 7922403
8.5%
l 7524872
 
8.1%
e 6161710
 
6.6%
d 5674220
 
6.1%
r 5611641
 
6.0%
c 5022856
 
5.4%
n 4723054
 
5.1%
t 4392846
 
4.7%
Other values (16) 16212133
17.4%
Uppercase Letter
ValueCountFrequency (%)
A 2993156
28.2%
M 1365520
12.9%
C 1144907
 
10.8%
P 1045845
 
9.9%
E 845954
 
8.0%
G 714563
 
6.7%
S 488541
 
4.6%
D 334992
 
3.2%
B 296918
 
2.8%
T 261542
 
2.5%
Other values (15) 1123655
 
10.6%
Decimal Number
ValueCountFrequency (%)
7 7
21.9%
2 5
15.6%
9 5
15.6%
5 3
9.4%
3 3
9.4%
8 3
9.4%
1 2
 
6.2%
4 2
 
6.2%
0 1
 
3.1%
6 1
 
3.1%
Other Punctuation
ValueCountFrequency (%)
, 8690181
> 99.9%
. 34
 
< 0.1%
? 14
 
< 0.1%
Space Separator
ValueCountFrequency (%)
8711721
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 285
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 169
100.0%
Close Punctuation
ValueCountFrequency (%)
] 169
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 126
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 103845954
85.6%
Common 17402734
 
14.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 19357102
18.6%
i 10627524
 
10.2%
o 7922403
 
7.6%
l 7524872
 
7.2%
e 6161710
 
5.9%
d 5674220
 
5.5%
r 5611641
 
5.4%
c 5022856
 
4.8%
n 4723054
 
4.5%
t 4392846
 
4.2%
Other values (41) 26827726
25.8%
Common
ValueCountFrequency (%)
8711721
50.1%
, 8690181
49.9%
- 285
 
< 0.1%
[ 169
 
< 0.1%
] 169
 
< 0.1%
_ 126
 
< 0.1%
. 34
 
< 0.1%
? 14
 
< 0.1%
7 7
 
< 0.1%
2 5
 
< 0.1%
Other values (9) 23
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 121248688
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 19357102
16.0%
i 10627524
 
8.8%
8711721
 
7.2%
, 8690181
 
7.2%
o 7922403
 
6.5%
l 7524872
 
6.2%
e 6161710
 
5.1%
d 5674220
 
4.7%
r 5611641
 
4.6%
c 5022856
 
4.1%
Other values (60) 35944458
29.6%
Distinct 13
Distinct (%) < 0.1%
Missing 2074
Missing (%) 0.1%
Memory size 14.7 MiB

Length

Max length 9
Median length 8
Mean length 8.00002079
Min length 7

Characters and Unicode

Total characters 15391936
Distinct characters 33
Distinct categories 5
Distinct scripts 2
Distinct blocks 1

Unique

Unique 7
Unique (%) < 0.1%

Sample

1st row Animalia
2nd row Animalia
3rd row Animalia
4th row Animalia
5th row Animalia

Value Count Frequency (%)
animalia 1921701 99.9%
protozoa 2154 0.1%
protista 55 < 0.1%
chromista 36 < 0.1%
bacteria 28 < 0.1%
eukaryota 6 < 0.1%
eukarya 1 < 0.1%
77.0364 1 < 0.1%
59.9317 1 < 0.1%
59.8585 1 < 0.1%
Other values (3) 3 < 0.1%

Most occurring characters

ValueCountFrequency (%)
a 3845717
25.0%
i 3843521
25.0%
m 1921737
12.5%
A 1921701
12.5%
l 1921701
12.5%
n 1921701
12.5%
o 6559
 
< 0.1%
t 2334
 
< 0.1%
r 2280
 
< 0.1%
P 2209
 
< 0.1%
Other values (23) 2476
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13467908
87.5%
Uppercase Letter 1923981
 
12.5%
Decimal Number 35
 
< 0.1%
Dash Punctuation 6
 
< 0.1%
Other Punctuation 6
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 3845717
28.6%
i 3843521
28.5%
m 1921737
14.3%
l 1921701
14.3%
n 1921701
14.3%
o 6559
 
< 0.1%
t 2334
 
< 0.1%
r 2280
 
< 0.1%
z 2154
 
< 0.1%
s 91
 
< 0.1%
Other values (6) 113
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
9 8
22.9%
5 5
14.3%
0 5
14.3%
7 5
14.3%
8 3
 
8.6%
6 2
 
5.7%
4 2
 
5.7%
3 2
 
5.7%
1 2
 
5.7%
2 1
 
2.9%
Uppercase Letter
ValueCountFrequency (%)
A 1921701
99.9%
P 2209
 
0.1%
C 36
 
< 0.1%
B 28
 
< 0.1%
E 7
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 6
100.0%
Other Punctuation
ValueCountFrequency (%)
. 6
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 15391889
> 99.9%
Common 47
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 3845717
25.0%
i 3843521
25.0%
m 1921737
12.5%
A 1921701
12.5%
l 1921701
12.5%
n 1921701
12.5%
o 6559
 
< 0.1%
t 2334
 
< 0.1%
r 2280
 
< 0.1%
P 2209
 
< 0.1%
Other values (11) 2429
 
< 0.1%
Common
ValueCountFrequency (%)
9 8
17.0%
- 6
12.8%
. 6
12.8%
5 5
10.6%
0 5
10.6%
7 5
10.6%
8 3
 
6.4%
6 2
 
4.3%
4 2
 
4.3%
3 2
 
4.3%
Other values (2) 3
 
6.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 15391936
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 3845717
25.0%
i 3843521
25.0%
m 1921737
12.5%
A 1921701
12.5%
l 1921701
12.5%
n 1921701
12.5%
o 6559
 
< 0.1%
t 2334
 
< 0.1%
r 2280
 
< 0.1%
P 2209
 
< 0.1%
Other values (23) 2476
 
< 0.1%
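
The percentages in this report appear as one-decimal figures, except that very small non-zero shares are shown as "< 0.1%" and shares just short of the whole as "> 99.9%". The sketch below reproduces that display for the figures of this column. The function name `format_share` and the rounding thresholds are assumptions chosen to match the values shown above, not a statement of how the profiler formats them.

```python
TOTAL_ROWS = 1926061  # number of observations stated in the report overview


def format_share(fraction: float) -> str:
    """Format a fraction in [0, 1] the way the report seems to display it."""
    # Non-zero shares that round to 0.000 are shown as an upper bound.
    if fraction > 0 and round(fraction, 3) == 0:
        return "< 0.1%"
    # Shares below 1 that round to 1.000 are shown as a lower bound.
    if fraction < 1 and round(fraction, 3) == 1:
        return "> 99.9%"
    return f"{fraction * 100:.1f}%"


missing = 2074
present = TOTAL_ROWS - missing

print("Missing (%)", format_share(missing / TOTAL_ROWS))
print("Distinct (%)", format_share(13 / present))
print("animalia", format_share(1921701 / present))
print("protozoa", format_share(2154 / present))
```

The same helper gives "> 99.9%" for a column with 1926059 missing rows out of 1926061, as reported for several sparse columns above.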

phylum
Text

Distinct 84
Distinct (%) < 0.1%
Missing 525
Missing (%) < 0.1%
Memory size 14.7 MiB

Length

Max length 31
Median length 8
Mean length 8.859807347
Min length 5

Characters and Unicode

Total characters 17059878
Distinct characters 50
Distinct categories 6
Distinct scripts 2
Distinct blocks 1

Unique

Unique 11
Unique (%) < 0.1%

Sample

1st row Porifera
2nd row Mollusca
3rd row Cnidaria
4th row Echinodermata
5th row Mollusca

Value Count Frequency (%)
mollusca 866254 45.0%
arthropoda 390685 20.3%
annelida 241588 12.5%
cnidaria 117378 6.1%
echinodermata 91192 4.7%
nematoda 68776 3.6%
platyhelminthes 46010 2.4%
porifera 32720 1.7%
chordata 19744 1.0%
sipuncula 10414 0.5%
Other values (84) 42151 2.2%

Most occurring characters

ValueCountFrequency (%)
a 2241799
13.1%
l 2085545
12.2%
o 1908402
11.2%
r 1107910
 
6.5%
c 989189
 
5.8%
d 934105
 
5.5%
s 915271
 
5.4%
u 887926
 
5.2%
M 867048
 
5.1%
n 770924
 
4.5%
Other values (40) 4351759
25.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15132549
88.7%
Uppercase Letter 1925540
 
11.3%
Space Separator 1376
 
< 0.1%
Dash Punctuation 283
 
< 0.1%
Connector Punctuation 126
 
< 0.1%
Decimal Number 4
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 2241799
14.8%
l 2085545
13.8%
o 1908402
12.6%
r 1107910
7.3%
c 989189
 
6.5%
d 934105
 
6.2%
s 915271
 
6.0%
u 887926
 
5.9%
n 770924
 
5.1%
t 685639
 
4.5%
Other values (14) 2605839
17.2%
Uppercase Letter
ValueCountFrequency (%)
M 867048
45.0%
A 638618
33.2%
C 140083
 
7.3%
E 91424
 
4.7%
P 80493
 
4.2%
N 75155
 
3.9%
S 11217
 
0.6%
B 10349
 
0.5%
K 6388
 
0.3%
H 2153
 
0.1%
Other values (11) 2612
 
0.1%
Decimal Number
ValueCountFrequency (%)
8 2
50.0%
4 2
50.0%
Space Separator
ValueCountFrequency (%)
1376
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 283
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 126
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17058089
> 99.9%
Common 1789
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 2241799
13.1%
l 2085545
12.2%
o 1908402
11.2%
r 1107910
 
6.5%
c 989189
 
5.8%
d 934105
 
5.5%
s 915271
 
5.4%
u 887926
 
5.2%
M 867048
 
5.1%
n 770924
 
4.5%
Other values (35) 4349970
25.5%
Common
ValueCountFrequency (%)
1376
76.9%
- 283
 
15.8%
_ 126
 
7.0%
8 2
 
0.1%
4 2
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 17059878
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 2241799
13.1%
l 2085545
12.2%
o 1908402
11.2%
r 1107910
 
6.5%
c 989189
 
5.8%
d 934105
 
5.5%
s 915271
 
5.4%
u 887926
 
5.2%
M 867048
 
5.1%
n 770924
 
4.5%
Other values (40) 4351759
25.5%

class
Text

Missing 

Distinct 140
Distinct (%) < 0.1%
Missing 76135
Missing (%) 4.0%
Memory size 14.7 MiB

Length

Max length 20
Median length 19
Mean length 10.09882287
Min length 4

Characters and Unicode

Total characters 18682075
Distinct characters 43
Distinct categories 2
Distinct scripts 1
Distinct blocks 1

Unique

Unique 18
Unique (%) < 0.1%

Sample

1st row Calcarea
2nd row Gastropoda
3rd row Anthozoa
4th row Ophiuroidea
5th row Gastropoda

Value Count Frequency (%)
gastropoda 612643 33.1%
malacostraca 301920 16.3%
polychaeta 210885 11.4%
bivalvia 207657 11.2%
anthozoa 93047 5.0%
maxillopoda 54367 2.9%
chromadorea 34765 1.9%
ophiuroidea 27083 1.5%
asteroidea 25627 1.4%
oligochaeta 25284 1.4%
Other values (130) 256648 13.9%

Most occurring characters

ValueCountFrequency (%)
a 4039714
21.6%
o 2550884
13.7%
t 1353417
 
7.2%
r 1171215
 
6.3%
s 1019474
 
5.5%
c 959574
 
5.1%
d 951027
 
5.1%
l 934032
 
5.0%
p 820053
 
4.4%
i 730905
 
3.9%
Other values (33) 4151780
22.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16832149
90.1%
Uppercase Letter 1849926
 
9.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4039714
24.0%
o 2550884
15.2%
t 1353417
 
8.0%
r 1171215
 
7.0%
s 1019474
 
6.1%
c 959574
 
5.7%
d 951027
 
5.7%
l 934032
 
5.5%
p 820053
 
4.9%
i 730905
 
4.3%
Other values (14) 2301854
13.7%
Uppercase Letter
ValueCountFrequency (%)
G 612731
33.1%
M 363412
19.6%
P 227238
 
12.3%
B 211201
 
11.4%
A 153646
 
8.3%
C 79724
 
4.3%
O 75877
 
4.1%
H 41146
 
2.2%
T 27301
 
1.5%
E 23522
 
1.3%
Other values (9) 34128
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Latin 18682075
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4039714
21.6%
o 2550884
13.7%
t 1353417
 
7.2%
r 1171215
 
6.3%
s 1019474
 
5.5%
c 959574
 
5.1%
d 951027
 
5.1%
l 934032
 
5.0%
p 820053
 
4.4%
i 730905
 
3.9%
Other values (33) 4151780
22.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18682075
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4039714
21.6%
o 2550884
13.7%
t 1353417
 
7.2%
r 1171215
 
6.3%
s 1019474
 
5.5%
c 959574
 
5.1%
d 951027
 
5.1%
l 934032
 
5.0%
p 820053
 
4.4%
i 730905
 
3.9%
Other values (33) 4151780
22.2%

order
Text

Missing 

Distinct: 464
Distinct (%): < 0.1%
Missing: 940799
Missing (%): 48.8%
Memory size: 14.7 MiB

Length

Max length: 28
Median length: 21
Mean length: 10.1311032
Min length: 5

Characters and Unicode

Total characters: 9981791
Distinct characters: 50
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 50
Unique (%): < 0.1%

Sample

1st row: Antipatharia
2nd row: Ophiurida
3rd row: Forcipulatida
4th row: Forcipulatida
5th row: Decapoda

Value               Count    Frequency (%)
decapoda            196699   20.0%
phyllodocida        69303    7.0%
scleractinia        54206    5.5%
amphipoda           49518    5.0%
isopoda             28998    2.9%
terebellida         28660    2.9%
unionoida           28558    2.9%
eunicida            25633    2.6%
ophiurida           22910    2.3%
calanoida           21058    2.1%
Other values (456)  459889   46.7%

Most occurring characters

ValueCountFrequency (%)
a 1613568
16.2%
o 1058436
10.6%
i 1000708
10.0%
d 988403
9.9%
c 644706
 
6.5%
e 615446
 
6.2%
p 533792
 
5.3%
l 509511
 
5.1%
n 359765
 
3.6%
r 349430
 
3.5%
Other values (40) 2308026
23.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8996191
90.1%
Uppercase Letter 985095
 
9.9%
Space Separator 170
 
< 0.1%
Open Punctuation 167
 
< 0.1%
Close Punctuation 167
 
< 0.1%
Other Punctuation 1
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1613568
17.9%
o 1058436
11.8%
i 1000708
11.1%
d 988403
11.0%
c 644706
 
7.2%
e 615446
 
6.8%
p 533792
 
5.9%
l 509511
 
5.7%
n 359765
 
4.0%
r 349430
 
3.9%
Other values (14) 1322426
14.7%
Uppercase Letter
ValueCountFrequency (%)
D 209529
21.3%
S 132542
13.5%
P 130890
13.3%
A 92015
9.3%
C 72395
 
7.3%
T 63419
 
6.4%
E 57025
 
5.8%
O 31777
 
3.2%
I 29448
 
3.0%
U 28573
 
2.9%
Other values (12) 137482
14.0%
Space Separator
ValueCountFrequency (%)
170
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 167
100.0%
Close Punctuation
ValueCountFrequency (%)
] 167
100.0%
Other Punctuation
ValueCountFrequency (%)
? 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9981286
> 99.9%
Common 505
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1613568
16.2%
o 1058436
10.6%
i 1000708
10.0%
d 988403
9.9%
c 644706
 
6.5%
e 615446
 
6.2%
p 533792
 
5.3%
l 509511
 
5.1%
n 359765
 
3.6%
r 349430
 
3.5%
Other values (36) 2307521
23.1%
Common
ValueCountFrequency (%)
170
33.7%
[ 167
33.1%
] 167
33.1%
? 1
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9981791
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1613568
16.2%
o 1058436
10.6%
i 1000708
10.0%
d 988403
9.9%
c 644706
 
6.5%
e 615446
 
6.2%
p 533792
 
5.3%
l 509511
 
5.1%
n 359765
 
3.6%
r 349430
 
3.5%
Other values (40) 2308026
23.1%

family
Text

Missing 

Distinct: 3009
Distinct (%): 0.2%
Missing: 191835
Missing (%): 10.0%
Memory size: 14.7 MiB

Length

Max length: 27
Median length: 23
Mean length: 11.08729197
Min length: 6

Characters and Unicode

Total characters: 19227870
Distinct characters: 55
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 298
Unique (%): < 0.1%

Sample

1st row: Bullidae
2nd row: Stylopathidae
3rd row: Ophiotrichidae
4th row: Cypraeidae
5th row: Asteriidae

Value                Count     Frequency (%)
conidae              38810     2.2%
cambaridae           29321     1.7%
unionidae            26838     1.5%
veneridae            17888     1.0%
trochidae            16919     1.0%
cerithiidae          16894     1.0%
cypraeidae           16831     1.0%
spionidae            15844     0.9%
buccinidae           15338     0.9%
syllidae             14112     0.8%
Other values (2998)  1525570   88.0%

Most occurring characters

ValueCountFrequency (%)
i 2890115
15.0%
a 2634149
13.7%
e 2542492
13.2%
d 1977044
10.3%
r 977057
 
5.1%
l 957443
 
5.0%
o 947839
 
4.9%
n 841051
 
4.4%
t 631565
 
3.3%
c 540802
 
2.8%
Other values (45) 4288313
22.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 17493461
91.0%
Uppercase Letter 1734226
 
9.0%
Space Separator 139
 
< 0.1%
Other Punctuation 41
 
< 0.1%
Math Symbol 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 2890115
16.5%
a 2634149
15.1%
e 2542492
14.5%
d 1977044
11.3%
r 977057
 
5.6%
l 957443
 
5.5%
o 947839
 
5.4%
n 841051
 
4.8%
t 631565
 
3.6%
c 540802
 
3.1%
Other values (16) 2553904
14.6%
Uppercase Letter
ValueCountFrequency (%)
C 285220
16.4%
P 255936
14.8%
S 149154
8.6%
A 140208
 
8.1%
T 138389
 
8.0%
M 92451
 
5.3%
O 88221
 
5.1%
L 80443
 
4.6%
H 73149
 
4.2%
N 66069
 
3.8%
Other values (15) 364986
21.0%
Other Punctuation
ValueCountFrequency (%)
. 28
68.3%
? 13
31.7%
Space Separator
ValueCountFrequency (%)
139
100.0%
Math Symbol
ValueCountFrequency (%)
+ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 19227687
> 99.9%
Common 183
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 2890115
15.0%
a 2634149
13.7%
e 2542492
13.2%
d 1977044
10.3%
r 977057
 
5.1%
l 957443
 
5.0%
o 947839
 
4.9%
n 841051
 
4.4%
t 631565
 
3.3%
c 540802
 
2.8%
Other values (41) 4288130
22.3%
Common
ValueCountFrequency (%)
139
76.0%
. 28
 
15.3%
? 13
 
7.1%
+ 3
 
1.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 19227870
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
i 2890115
15.0%
a 2634149
13.7%
e 2542492
13.2%
d 1977044
10.3%
r 977057
 
5.1%
l 957443
 
5.0%
o 947839
 
4.9%
n 841051
 
4.4%
t 631565
 
3.3%
c 540802
 
2.8%
Other values (45) 4288313
22.3%

subfamily
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926060
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 6
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 47 57 0 N

Value  Count  Frequency (%)
47     1      25.0%
57     1      25.0%
0      1      25.0%
n      1      25.0%

Most occurring characters

ValueCountFrequency (%)
3
33.3%
7 2
22.2%
4 1
 
11.1%
5 1
 
11.1%
0 1
 
11.1%
N 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5
55.6%
Space Separator 3
33.3%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
7 2
40.0%
4 1
20.0%
5 1
20.0%
0 1
20.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Uppercase Letter
ValueCountFrequency (%)
N 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
88.9%
Latin 1
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3
37.5%
7 2
25.0%
4 1
 
12.5%
5 1
 
12.5%
0 1
 
12.5%
Latin
ValueCountFrequency (%)
N 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3
33.3%
7 2
22.2%
4 1
 
11.1%
5 1
 
11.1%
0 1
 
11.1%
N 1
 
11.1%

tribe
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926060
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 9
Distinct characters: 7
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: 129 4 0 W

Value  Count  Frequency (%)
129    1      25.0%
4      1      25.0%
0      1      25.0%
w      1      25.0%

Most occurring characters

ValueCountFrequency (%)
3
33.3%
1 1
 
11.1%
2 1
 
11.1%
9 1
 
11.1%
4 1
 
11.1%
0 1
 
11.1%
W 1
 
11.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 5
55.6%
Space Separator 3
33.3%
Uppercase Letter 1
 
11.1%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 1
20.0%
2 1
20.0%
9 1
20.0%
4 1
20.0%
0 1
20.0%
Space Separator
ValueCountFrequency (%)
3
100.0%
Uppercase Letter
ValueCountFrequency (%)
W 1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 8
88.9%
Latin 1
 
11.1%

Most frequent character per script

Common
ValueCountFrequency (%)
3
37.5%
1 1
 
12.5%
2 1
 
12.5%
9 1
 
12.5%
4 1
 
12.5%
0 1
 
12.5%
Latin
ValueCountFrequency (%)
W 1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 9
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
3
33.3%
1 1
 
11.1%
2 1
 
11.1%
9 1
 
11.1%
4 1
 
11.1%
0 1
 
11.1%
W 1
 
11.1%

subtribe
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926060
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 13
Distinct characters: 11
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Seurat, L. G.

Value   Count  Frequency (%)
seurat  1      33.3%
l       1      33.3%
g       1      33.3%
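
The value table above lists lowercased tokens rather than whole cell values: the single cell "Seurat, L. G." is reported as seurat, l and g. A minimal sketch of a tokenisation that reproduces this, assuming whitespace splitting followed by stripping of surrounding ASCII punctuation. The function name `word_counts` is hypothetical, and the rules the report generator actually applies are not stated here and may differ.

```python
import string
from collections import Counter


def word_counts(values):
    """Count lowercased, punctuation-stripped tokens across all cell values."""
    counts = Counter()
    for value in values:
        for token in value.lower().split():
            # Remove leading/trailing punctuation such as "," and ".".
            token = token.strip(string.punctuation)
            if token:
                counts[token] += 1
    return counts


subtribe_counts = word_counts(["Seurat, L. G."])
subfamily_counts = word_counts(["47 57 0 N"])
```

Under these assumptions, `subtribe_counts` holds one occurrence each of `seurat`, `l` and `g`, matching the table for this variable.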

Most occurring characters

ValueCountFrequency (%)
2
15.4%
. 2
15.4%
S 1
7.7%
e 1
7.7%
u 1
7.7%
r 1
7.7%
a 1
7.7%
t 1
7.7%
, 1
7.7%
L 1
7.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 5
38.5%
Other Punctuation 3
23.1%
Uppercase Letter 3
23.1%
Space Separator 2
 
15.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 1
20.0%
u 1
20.0%
r 1
20.0%
a 1
20.0%
t 1
20.0%
Uppercase Letter
ValueCountFrequency (%)
S 1
33.3%
L 1
33.3%
G 1
33.3%
Other Punctuation
ValueCountFrequency (%)
. 2
66.7%
, 1
33.3%
Space Separator
ValueCountFrequency (%)
2
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8
61.5%
Common 5
38.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
S 1
12.5%
e 1
12.5%
u 1
12.5%
r 1
12.5%
a 1
12.5%
t 1
12.5%
L 1
12.5%
G 1
12.5%
Common
ValueCountFrequency (%)
2
40.0%
. 2
40.0%
, 1
20.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 13
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2
15.4%
. 2
15.4%
S 1
7.7%
e 1
7.7%
u 1
7.7%
r 1
7.7%
a 1
7.7%
t 1
7.7%
, 1
7.7%
L 1
7.7%

genus
Text

Missing 

Distinct: 21650
Distinct (%): 1.4%
Missing: 353878
Missing (%): 18.4%
Memory size: 14.7 MiB

Length

Max length: 27
Median length: 23
Mean length: 9.304575867
Min length: 2

Characters and Unicode

Total characters: 14628496
Distinct characters: 56
Distinct categories: 5
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4273
Unique (%): 0.3%

Sample

1st row: Scypha
2nd row: Bulla
3rd row: Stylopathes
4th row: Ophiothrix
5th row: Cypraea

Value                 Count     Frequency (%)
conus                 24245     1.5%
cypraea               15393     1.0%
cambarus              10444     0.7%
cerithium             9393      0.6%
orconectes            8665      0.6%
procambarus           8127      0.5%
nassarius             6727      0.4%
lumbrineris           4966      0.3%
terebra               4965      0.3%
aricidea              4582      0.3%
Other values (21641)  1474698   93.8%

Most occurring characters

ValueCountFrequency (%)
a 1747537
 
11.9%
i 1265907
 
8.7%
o 1157768
 
7.9%
e 1018454
 
7.0%
r 970134
 
6.6%
s 940035
 
6.4%
l 916555
 
6.3%
t 707468
 
4.8%
n 704501
 
4.8%
u 688774
 
4.7%
Other values (46) 4511363
30.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 13056275
89.3%
Uppercase Letter 1572183
 
10.7%
Space Separator 22
 
< 0.1%
Other Punctuation 11
 
< 0.1%
Dash Punctuation 5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1747537
13.4%
i 1265907
9.7%
o 1157768
 
8.9%
e 1018454
 
7.8%
r 970134
 
7.4%
s 940035
 
7.2%
l 916555
 
7.0%
t 707468
 
5.4%
n 704501
 
5.4%
u 688774
 
5.3%
Other values (16) 2939142
22.5%
Uppercase Letter
ValueCountFrequency (%)
C 229739
14.6%
P 219867
14.0%
A 155473
9.9%
S 126285
 
8.0%
M 103197
 
6.6%
T 96867
 
6.2%
L 91233
 
5.8%
E 82418
 
5.2%
O 74602
 
4.7%
N 62734
 
4.0%
Other values (16) 329768
21.0%
Other Punctuation
ValueCountFrequency (%)
. 10
90.9%
/ 1
 
9.1%
Space Separator
ValueCountFrequency (%)
22
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 14628458
> 99.9%
Common 38
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1747537
 
11.9%
i 1265907
 
8.7%
o 1157768
 
7.9%
e 1018454
 
7.0%
r 970134
 
6.6%
s 940035
 
6.4%
l 916555
 
6.3%
t 707468
 
4.8%
n 704501
 
4.8%
u 688774
 
4.7%
Other values (42) 4511325
30.8%
Common
ValueCountFrequency (%)
22
57.9%
. 10
26.3%
- 5
 
13.2%
/ 1
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII 14628496
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1747537
 
11.9%
i 1265907
 
8.7%
o 1157768
 
7.9%
e 1018454
 
7.0%
r 970134
 
6.6%
s 940035
 
6.4%
l 916555
 
6.3%
t 707468
 
4.8%
n 704501
 
4.8%
u 688774
 
4.7%
Other values (46) 4511363
30.8%

subgenus
Text

Missing 

Distinct: 2864
Distinct (%): 2.5%
Missing: 1813329
Missing (%): 94.1%
Memory size: 14.7 MiB

Length

Max length: 19
Median length: 16
Mean length: 10.25069191
Min length: 3

Characters and Unicode

Total characters: 1155581
Distinct characters: 52
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 738
Unique (%): 0.7%

Sample

1st row: Ortmannicus
2nd row: Torquis
3rd row: Scolelepis
4th row: Caryophyllia
5th row: Pitarenus

Value                Count   Frequency (%)
thericium            3470    3.1%
depressicambarus     2960    2.6%
ortmannicus          2586    2.3%
stephanoconus        2431    2.2%
cambarus             1558    1.4%
canarium             1428    1.3%
nebularia            1392    1.2%
costellaria          1392    1.2%
strigatella          1335    1.2%
pennides             1328    1.2%
Other values (2854)  92852   82.4%

Most occurring characters

ValueCountFrequency (%)
a 138855
12.0%
i 104726
 
9.1%
o 86546
 
7.5%
r 84920
 
7.3%
s 80324
 
7.0%
l 69319
 
6.0%
e 68847
 
6.0%
u 66055
 
5.7%
n 64717
 
5.6%
t 53444
 
4.6%
Other values (42) 337828
29.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1042849
90.2%
Uppercase Letter 112732
 
9.8%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 138855
13.3%
i 104726
10.0%
o 86546
 
8.3%
r 84920
 
8.1%
s 80324
 
7.7%
l 69319
 
6.6%
e 68847
 
6.6%
u 66055
 
6.3%
n 64717
 
6.2%
t 53444
 
5.1%
Other values (16) 225096
21.6%
Uppercase Letter
ValueCountFrequency (%)
C 16928
15.0%
P 16842
14.9%
A 10435
9.3%
T 9865
8.8%
S 8909
 
7.9%
M 6567
 
5.8%
L 6453
 
5.7%
D 5642
 
5.0%
O 4368
 
3.9%
N 3515
 
3.1%
Other values (16) 23208
20.6%

Most occurring scripts

ValueCountFrequency (%)
Latin 1155581
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 138855
12.0%
i 104726
 
9.1%
o 86546
 
7.5%
r 84920
 
7.3%
s 80324
 
7.0%
l 69319
 
6.0%
e 68847
 
6.0%
u 66055
 
5.7%
n 64717
 
5.6%
t 53444
 
4.6%
Other values (42) 337828
29.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1155581
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 138855
12.0%
i 104726
 
9.1%
o 86546
 
7.5%
r 84920
 
7.3%
s 80324
 
7.0%
l 69319
 
6.0%
e 68847
 
6.0%
u 66055
 
5.7%
n 64717
 
5.6%
t 53444
 
4.6%
Other values (42) 337828
29.2%

specificEpithet
Text

Missing 

Distinct: 46656
Distinct (%): 3.0%
Missing: 353916
Missing (%): 18.4%
Memory size: 14.7 MiB

Length

Max length: 22
Median length: 19
Mean length: 7.826697919
Min length: 1

Characters and Unicode

Total characters: 12304704
Distinct characters: 46
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 13428
Unique (%): 0.9%

Sample

1st row: sp.
2nd row: striata
3rd row: columnaris
4th row: suensonii
5th row: labrolineata

Value                 Count     Frequency (%)
sp                    198016    12.6%
gracilis              6359      0.4%
affinis               3601      0.2%
fragilis              3504      0.2%
elegans               3414      0.2%
aculeata              3109      0.2%
borealis              2990      0.2%
americanus            2825      0.2%
grandis               2552      0.2%
tenuis                2439      0.2%
Other values (46628)  1344736   85.5%

Most occurring characters

ValueCountFrequency (%)
a 1648406
13.4%
i 1322391
10.7%
s 1208599
9.8%
e 828620
 
6.7%
r 813081
 
6.6%
t 747136
 
6.1%
u 735526
 
6.0%
n 734569
 
6.0%
l 699734
 
5.7%
c 585047
 
4.8%
Other values (36) 2981595
24.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 12104862
98.4%
Other Punctuation 198291
 
1.6%
Space Separator 1400
 
< 0.1%
Dash Punctuation 89
 
< 0.1%
Decimal Number 56
 
< 0.1%
Open Punctuation 3
 
< 0.1%
Close Punctuation 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 1648406
13.6%
i 1322391
10.9%
s 1208599
10.0%
e 828620
 
6.8%
r 813081
 
6.7%
t 747136
 
6.2%
u 735526
 
6.1%
n 734569
 
6.1%
l 699734
 
5.8%
c 585047
 
4.8%
Other values (18) 2781753
23.0%
Other Punctuation
ValueCountFrequency (%)
. 198202
> 99.9%
" 58
 
< 0.1%
' 13
 
< 0.1%
/ 13
 
< 0.1%
, 3
 
< 0.1%
? 2
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 31
55.4%
2 18
32.1%
3 4
 
7.1%
4 1
 
1.8%
5 1
 
1.8%
6 1
 
1.8%
Open Punctuation
ValueCountFrequency (%)
( 2
66.7%
[ 1
33.3%
Close Punctuation
ValueCountFrequency (%)
) 2
66.7%
] 1
33.3%
Space Separator
ValueCountFrequency (%)
1400
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 89
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 12104862
98.4%
Common 199842
 
1.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 1648406
13.6%
i 1322391
10.9%
s 1208599
10.0%
e 828620
 
6.8%
r 813081
 
6.7%
t 747136
 
6.2%
u 735526
 
6.1%
n 734569
 
6.1%
l 699734
 
5.8%
c 585047
 
4.8%
Other values (18) 2781753
23.0%
Common
ValueCountFrequency (%)
. 198202
99.2%
1400
 
0.7%
- 89
 
< 0.1%
" 58
 
< 0.1%
1 31
 
< 0.1%
2 18
 
< 0.1%
' 13
 
< 0.1%
/ 13
 
< 0.1%
3 4
 
< 0.1%
, 3
 
< 0.1%
Other values (8) 11
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12304688
> 99.9%
None 16
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 1648406
13.4%
i 1322391
10.7%
s 1208599
9.8%
e 828620
 
6.7%
r 813081
 
6.6%
t 747136
 
6.1%
u 735526
 
6.0%
n 734569
 
6.0%
l 699734
 
5.7%
c 585047
 
4.8%
Other values (34) 2981579
24.2%
None
ValueCountFrequency (%)
ü 15
93.8%
æ 1
 
6.2%

infraspecificEpithet
Text

Missing 

Distinct: 6142
Distinct (%): 10.4%
Missing: 1866911
Missing (%): 96.9%
Memory size: 14.7 MiB

Length

Max length: 31
Median length: 27
Mean length: 8.681639899
Min length: 3

Characters and Unicode

Total characters: 513519
Distinct characters: 33
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2084
Unique (%): 3.5%

Sample

1st row: tuberculosa
2nd row: imbricata
3rd row: connectens
4th row: laevis
5th row: bonachensis

Value                Count   Frequency (%)
acutus               1104    1.8%
radiata              638     1.1%
bartonii             521     0.9%
gibbosus             501     0.8%
appressa             444     0.7%
modicella            437     0.7%
rusticus             389     0.6%
campanulata          379     0.6%
carinata             372     0.6%
minor                370     0.6%
Other values (6099)  54802   91.4%

Most occurring characters

ValueCountFrequency (%)
a 74589
14.5%
i 56561
11.0%
s 47859
9.3%
e 37680
 
7.3%
n 37239
 
7.3%
r 32753
 
6.4%
u 31295
 
6.1%
t 28838
 
5.6%
l 28120
 
5.5%
c 26380
 
5.1%
Other values (23) 112205
21.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 512368
99.8%
Space Separator 807
 
0.2%
Other Punctuation 332
 
0.1%
Dash Punctuation 12
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 74589
14.6%
i 56561
11.0%
s 47859
9.3%
e 37680
 
7.4%
n 37239
 
7.3%
r 32753
 
6.4%
u 31295
 
6.1%
t 28838
 
5.6%
l 28120
 
5.5%
c 26380
 
5.1%
Other values (16) 111054
21.7%
Other Punctuation
ValueCountFrequency (%)
. 313
94.3%
/ 15
 
4.5%
' 2
 
0.6%
? 1
 
0.3%
, 1
 
0.3%
Space Separator
ValueCountFrequency (%)
807
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 12
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 512368
99.8%
Common 1151
 
0.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 74589
14.6%
i 56561
11.0%
s 47859
9.3%
e 37680
 
7.4%
n 37239
 
7.3%
r 32753
 
6.4%
u 31295
 
6.1%
t 28838
 
5.6%
l 28120
 
5.5%
c 26380
 
5.1%
Other values (16) 111054
21.7%
Common
ValueCountFrequency (%)
807
70.1%
. 313
 
27.2%
/ 15
 
1.3%
- 12
 
1.0%
' 2
 
0.2%
? 1
 
0.1%
, 1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 513519
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 74589
14.5%
i 56561
11.0%
s 47859
9.3%
e 37680
 
7.3%
n 37239
 
7.3%
r 32753
 
6.4%
u 31295
 
6.1%
t 28838
 
5.6%
l 28120
 
5.5%
c 26380
 
5.1%
Other values (23) 112205
21.9%

cultivarEpithet
Text

Constant  Missing 

Distinct: 1
Distinct (%): 33.3%
Missing: 1926058
Missing (%): > 99.9%
Memory size: 14.7 MiB
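
The figures for this variable are consistent with Distinct (%) being computed over non-missing values only (1 distinct value among 3 present values gives 33.3%), while Missing (%) is computed over all 1926061 observations. The sketch below checks that arithmetic; the function name is hypothetical, and the denominators are an inference from the numbers in this report, not a statement of how the profiling tool is implemented.

```python
def distinct_and_missing_pct(n_rows, n_missing, n_distinct):
    """Return (distinct %, missing %) under the assumed denominators."""
    n_present = n_rows - n_missing
    # Assumption: distinct % is relative to the non-missing values.
    distinct_pct = 100.0 * n_distinct / n_present if n_present else 0.0
    # Missing % is relative to all rows.
    missing_pct = 100.0 * n_missing / n_rows
    return distinct_pct, missing_pct


# cultivarEpithet: 1926061 rows, 1926058 missing, 1 distinct value.
distinct_pct, missing_pct = distinct_and_missing_pct(1926061, 1926058, 1)
```

The same calculation applied to subgenus (1813329 missing, 2864 distinct) gives roughly 2.5% distinct and 94.1% missing, in line with the values reported for that variable.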

Length

Max length: 9
Median length: 9
Mean length: 9
Min length: 9

Characters and Unicode

Total characters: 27
Distinct characters: 9
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: GEOLocate
2nd row: GEOLocate
3rd row: GEOLocate

Value      Count  Frequency (%)
geolocate  3      100.0%

Most occurring characters

ValueCountFrequency (%)
G 3
11.1%
E 3
11.1%
O 3
11.1%
L 3
11.1%
o 3
11.1%
c 3
11.1%
a 3
11.1%
t 3
11.1%
e 3
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 15
55.6%
Uppercase Letter 12
44.4%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 3
20.0%
c 3
20.0%
a 3
20.0%
t 3
20.0%
e 3
20.0%
Uppercase Letter
ValueCountFrequency (%)
G 3
25.0%
E 3
25.0%
O 3
25.0%
L 3
25.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 27
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
G 3
11.1%
E 3
11.1%
O 3
11.1%
L 3
11.1%
o 3
11.1%
c 3
11.1%
a 3
11.1%
t 3
11.1%
e 3
11.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 27
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
G 3
11.1%
E 3
11.1%
O 3
11.1%
L 3
11.1%
o 3
11.1%
c 3
11.1%
a 3
11.1%
t 3
11.1%
e 3
11.1%

taxonRank
Text

Missing 

Distinct: 2
Distinct (%): < 0.1%
Missing: 1866911
Missing (%): 96.9%
Memory size: 14.7 MiB

Length

Max length: 10
Median length: 10
Mean length: 9.999847844
Min length: 7

Characters and Unicode

Total characters: 591491
Distinct characters: 12
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: subspecies
2nd row: subspecies
3rd row: subspecies
4th row: subspecies
5th row: subspecies

Value       Count   Frequency (%)
subspecies  59147   > 99.9%
variety     3       < 0.1%

Most occurring characters

ValueCountFrequency (%)
s 177441
30.0%
e 118297
20.0%
i 59150
 
10.0%
u 59147
 
10.0%
b 59147
 
10.0%
p 59147
 
10.0%
c 59147
 
10.0%
V 3
 
< 0.1%
a 3
 
< 0.1%
r 3
 
< 0.1%
Other values (2) 6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 591488
> 99.9%
Uppercase Letter 3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
s 177441
30.0%
e 118297
20.0%
i 59150
 
10.0%
u 59147
 
10.0%
b 59147
 
10.0%
p 59147
 
10.0%
c 59147
 
10.0%
a 3
 
< 0.1%
r 3
 
< 0.1%
t 3
 
< 0.1%
Uppercase Letter
ValueCountFrequency (%)
V 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 591491
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
s 177441
30.0%
e 118297
20.0%
i 59150
 
10.0%
u 59147
 
10.0%
b 59147
 
10.0%
p 59147
 
10.0%
c 59147
 
10.0%
V 3
 
< 0.1%
a 3
 
< 0.1%
r 3
 
< 0.1%
Other values (2) 6
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 591491
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
s 177441
30.0%
e 118297
20.0%
i 59150
 
10.0%
u 59147
 
10.0%
b 59147
 
10.0%
p 59147
 
10.0%
c 59147
 
10.0%
V 3
 
< 0.1%
a 3
 
< 0.1%
r 3
 
< 0.1%
Other values (2) 6
 
< 0.1%
Distinct: 12117
Distinct (%): 1.0%
Missing: 756930
Missing (%): 39.3%
Memory size: 14.7 MiB

Length

Max length: 69
Median length: 47
Mean length: 8.788540377
Min length: 2

Characters and Unicode

Total characters: 10274955
Distinct characters: 88
Distinct categories: 7
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2539
Unique (%): 0.2%

Sample

1st row: Bruguière
2nd row: (Duchassaing)
3rd row: Lutken
4th row: Gaokoin
5th row: Fisher

Value                Count     Frequency (%)
(not shown)          98247     6.8%
linnaeus             78120     5.4%
say                  43821     3.0%
lamarck              28278     1.9%
verrill              22061     1.5%
stimpson             21858     1.5%
gmelin               20022     1.4%
dall                 17930     1.2%
sowerby              15888     1.1%
smith                15824     1.1%
Other values (7043)  1091668   75.1%

Most occurring characters

ValueCountFrequency (%)
e 906924
 
8.8%
a 780692
 
7.6%
n 688330
 
6.7%
r 661943
 
6.4%
( 616138
 
6.0%
) 616138
 
6.0%
i 579234
 
5.6%
s 498727
 
4.9%
l 491449
 
4.8%
o 390672
 
3.8%
Other values (78) 4044708
39.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 7284990
70.9%
Uppercase Letter 1336396
 
13.0%
Open Punctuation 616138
 
6.0%
Close Punctuation 616138
 
6.0%
Space Separator 284586
 
2.8%
Other Punctuation 118180
 
1.2%
Dash Punctuation 18527
 
0.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 906924
12.4%
a 780692
10.7%
n 688330
9.4%
r 661943
 
9.1%
i 579234
 
8.0%
s 498727
 
6.8%
l 491449
 
6.7%
o 390672
 
5.4%
u 306318
 
4.2%
t 301316
 
4.1%
Other values (40) 1679385
23.1%
Uppercase Letter
ValueCountFrequency (%)
L 176987
13.2%
S 174895
13.1%
M 116679
 
8.7%
B 107118
 
8.0%
H 100415
 
7.5%
C 73945
 
5.5%
D 73228
 
5.5%
G 71058
 
5.3%
R 70187
 
5.3%
P 59533
 
4.5%
Other values (20) 312351
23.4%
Other Punctuation
ValueCountFrequency (%)
& 98246
83.1%
. 12420
 
10.5%
' 7462
 
6.3%
, 52
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
( 616138
100.0%
Close Punctuation
ValueCountFrequency (%)
) 616138
100.0%
Space Separator
ValueCountFrequency (%)
284586
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 18527
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 8621386
83.9%
Common 1653569
 
16.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 906924
 
10.5%
a 780692
 
9.1%
n 688330
 
8.0%
r 661943
 
7.7%
i 579234
 
6.7%
s 498727
 
5.8%
l 491449
 
5.7%
o 390672
 
4.5%
u 306318
 
3.6%
t 301316
 
3.5%
Other values (70) 3015781
35.0%
Common
ValueCountFrequency (%)
( 616138
37.3%
) 616138
37.3%
284586
17.2%
& 98246
 
5.9%
- 18527
 
1.1%
. 12420
 
0.8%
' 7462
 
0.5%
, 52
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 10228371
99.5%
None 46584
 
0.5%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 906924
 
8.9%
a 780692
 
7.6%
n 688330
 
6.7%
r 661943
 
6.5%
( 616138
 
6.0%
) 616138
 
6.0%
i 579234
 
5.7%
s 498727
 
4.9%
l 491449
 
4.8%
o 390672
 
3.8%
Other values (50) 3998124
39.1%
None
ValueCountFrequency (%)
ü 17514
37.6%
è 17194
36.9%
é 4508
 
9.7%
ä 1796
 
3.9%
ö 1657
 
3.6%
ø 1384
 
3.0%
å 620
 
1.3%
Ö 391
 
0.8%
á 269
 
0.6%
ñ 248
 
0.5%
Other values (18) 1003
 
2.2%

nomenclaturalCode
Text

Missing 

Distinct: 2
Distinct (%): 100.0%
Missing: 1926059
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 17
Median length: 15
Mean length: 15
Min length: 13

Characters and Unicode

Total characters: 30
Distinct characters: 20
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 2
Unique (%): 100.0%

Sample

1st row: Van Cleave, H. J.
2nd row: Schwartz, Ben

Value     Count  Frequency (%)
van       1      16.7%
cleave    1      16.7%
h         1      16.7%
j         1      16.7%
schwartz  1      16.7%
ben       1      16.7%

Most occurring characters

ValueCountFrequency (%)
4
 
13.3%
e 3
 
10.0%
a 3
 
10.0%
. 2
 
6.7%
n 2
 
6.7%
, 2
 
6.7%
c 1
 
3.3%
z 1
 
3.3%
t 1
 
3.3%
r 1
 
3.3%
Other values (10) 10
33.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
53.3%
Uppercase Letter 6
 
20.0%
Space Separator 4
 
13.3%
Other Punctuation 4
 
13.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 3
18.8%
a 3
18.8%
n 2
12.5%
c 1
 
6.2%
z 1
 
6.2%
t 1
 
6.2%
r 1
 
6.2%
w 1
 
6.2%
h 1
 
6.2%
v 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
V 1
16.7%
S 1
16.7%
J 1
16.7%
H 1
16.7%
C 1
16.7%
B 1
16.7%
Other Punctuation
ValueCountFrequency (%)
. 2
50.0%
, 2
50.0%
Space Separator
ValueCountFrequency (%)
4
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 22
73.3%
Common 8
 
26.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 3
13.6%
a 3
13.6%
n 2
 
9.1%
c 1
 
4.5%
z 1
 
4.5%
t 1
 
4.5%
r 1
 
4.5%
w 1
 
4.5%
h 1
 
4.5%
V 1
 
4.5%
Other values (7) 7
31.8%
Common
ValueCountFrequency (%)
4
50.0%
. 2
25.0%
, 2
25.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 30
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
4
 
13.3%
e 3
 
10.0%
a 3
 
10.0%
. 2
 
6.7%
n 2
 
6.7%
, 2
 
6.7%
c 1
 
3.3%
z 1
 
3.3%
t 1
 
3.3%
r 1
 
3.3%
Other values (10) 10
33.3%

nomenclaturalStatus
Text

Constant  Missing 

Distinct: 1
Distinct (%): 100.0%
Missing: 1926060
Missing (%): > 99.9%
Memory size: 14.7 MiB

Length

Max length: 18
Median length: 18
Mean length: 18
Min length: 18

Characters and Unicode

Total characters: 18
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 100.0%

Sample

1st row: Camallanus seurati

Value       Count  Frequency (%)
camallanus  1      50.0%
seurati     1      50.0%

Most occurring characters

ValueCountFrequency (%)
a 4
22.2%
l 2
11.1%
u 2
11.1%
s 2
11.1%
C 1
 
5.6%
m 1
 
5.6%
n 1
 
5.6%
1
 
5.6%
e 1
 
5.6%
r 1
 
5.6%
Other values (2) 2
11.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 16
88.9%
Uppercase Letter 1
 
5.6%
Space Separator 1
 
5.6%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 4
25.0%
l 2
12.5%
u 2
12.5%
s 2
12.5%
m 1
 
6.2%
n 1
 
6.2%
e 1
 
6.2%
r 1
 
6.2%
t 1
 
6.2%
i 1
 
6.2%
Uppercase Letter
ValueCountFrequency (%)
C 1
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 17
94.4%
Common 1
 
5.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 4
23.5%
l 2
11.8%
u 2
11.8%
s 2
11.8%
C 1
 
5.9%
m 1
 
5.9%
n 1
 
5.9%
e 1
 
5.9%
r 1
 
5.9%
t 1
 
5.9%
Common
ValueCountFrequency (%)
1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 18
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a 4
22.2%
l 2
11.1%
u 2
11.1%
s 2
11.1%
C 1
 
5.6%
m 1
 
5.6%
n 1
 
5.6%
1
 
5.6%
e 1
 
5.6%
r 1
 
5.6%
Other values (2) 2
11.1%